SYNTHETIC PLANT GENES AND METHOD FOR PREPARATION

BACKGROUND OF THE INVENTION 


The present invention relates to genetic engineering and more particularly to plant transformation in
which a plant is transformed to express a heterologous gene.

Although great progress has been made in recent years with respect to transgenic plants which express
foreign proteins such as herbicide resistant enzymes and viral coat proteins, very little is known about the
major factors affecting expression of foreign genes in plants. Several potential factors could be responsible
in varying degrees for the level of protein expression from a particular coding sequence. The level of a
particular mRNA in the cell is certainly a critical factor.

The potential causes of low steady state levels of mRNA due to the nature of the coding sequence are
many. First, full length RNA synthesis might not occur at a high frequency. This could, for example, be
caused by the premature termination of RNA during transcription or by unexpected mRNA processing
during transcription. Second, full length RNA could be produced but then processed (splicing, polyA
addition) in the nucleus in a fashion that creates a nonfunctional mRNA. If the RNA is properly synthesized,
terminated and polyadenylated, it then can move to the cytoplasm for translation. In the cytoplasm, mRNAs
have distinct half lives that are determined by their sequences and by the cell type in which they are
expressed. Some RNAs are very short-lived and some are much more long-lived. In addition, there is an
effect, whose magnitude is uncertain, of translational efficiency on mRNA half-life. In addition, every RNA
molecule folds into a particular structure, or perhaps family of structures, which is determined by its
sequence. The particular structure of any RNA might lead to greater or lesser stability in the cytoplasm.
Structure per se is probably also a determinant of mRNA processing in the nucleus. Unfortunately, it is
impossible to predict, and nearly impossible to determine, the structure of any RNA (except for tRNA) in
vitro or in vivo. However, it is likely that dramatically changing the sequence of an RNA will have a large
effect on its folded structure. It is likely that structure per se or particular structural features also have a role
in determining RNA stability.

Some particular sequences and signals have been identified in RNAs that have the potential for having
a specific effect on RNA stability. This section summarizes what is known about these sequences and
signals. These identified sequences often are A+T rich, and thus are more likely to occur in an A+T rich
coding sequence such as a B.t. gene. The sequence motif ATTTA (or AUUUA as it appears in RNA) has
been implicated as a destabilizing sequence in mammalian cell mRNA (Shaw and Kamen, 1986). No
analysis of the function of this sequence in plants has been done. Many short lived mRNAs have A+T rich
3' untranslated regions, and these regions often have the ATTTA sequence, sometimes present in multiple
copies or as multimers (e.g., ATTTATTTA...). Shaw and Kamen showed that the transfer of the 3' end of an
unstable mRNA to a stable RNA (globin or VA1) decreased the stable RNA's half life dramatically. They
further showed that a pentamer of ATTTA had a profound destabilizing effect on a stable message and that
this signal could exert its effect whether it was located at the 3' end or within the coding sequence.
However, the number of ATTTA sequences and/or the sequence context in which they occur also appear to
be important in determining whether they function as destabilizing sequences. Shaw and Kamen showed
that a trimer of ATTTA had much less effect than a pentamer on mRNA stability and a dimer or a monomer
had no effect on stability (Shaw and Kamen, 1987). Note that multimers of ATTTA such as a pentamer
automatically create an A+T rich region. This was shown to be a cytoplasmic effect, not nuclear. In other
unstable mRNAs, the ATTTA sequence may be present in only a single copy, but it is often contained in an
A+T rich region. From the animal cell data collected to date, it appears that ATTTA at least in some
contexts is important in stability, but it is not yet possible to predict which occurrences of ATTTA are
destabilizing elements or whether any of these effects are likely to be seen in plants.
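The distinction drawn above between single ATTTA copies and multimers can be illustrated with a short sequence scan. The following Python sketch is illustrative only: the function name `find_attta_runs` is hypothetical, and it assumes that a multimer takes the overlapping form of the ATTTATTTA example given in the text, in which adjacent copies share an A. It reports where each run starts and how many copies it contains; it does not predict whether a given run is destabilizing.

```python
import re


def find_attta_runs(seq):
    """Return (start, copies) for each run of ATTTA in a DNA or RNA sequence.

    Assumption of this sketch: a multimer is a run of overlapping copies
    that share an A, as in the ATTTATTTA example (2 copies).
    """
    # Treat RNA (AUUUA) and DNA (ATTTA) spellings alike.
    seq = seq.upper().replace("U", "T")
    runs = []
    # One ATTTA followed by zero or more overlapping TTTA extensions.
    for match in re.finditer(r"ATTTA(?:TTTA)*", seq):
        copies = 1 + (len(match.group()) - 5) // 4
        runs.append((match.start(), copies))
    return runs


if __name__ == "__main__":
    # A monomer at position 2 and a trimer at position 9.
    print(find_attta_runs("GGATTTAGGATTTATTTATTTACC"))
```

A scan of this kind only locates candidate motifs. As the text notes, sequence context and copy number both appear to matter, so the output is a list of sites to consider and not a stability prediction.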

Some studies on mRNA degradation in animal cells also indicate that RNA degradation may begin in
some cases with nucleolytic attack in A+T rich regions. It is not clear if these cleavages occur at ATTTA
sequences. There are also examples of mRNAs that have differential stability depending on the cell type in
which they are expressed or on the stage within the cell cycle at which they are expressed. For example,
histone mRNAs are stable during DNA synthesis but unstable if DNA synthesis is disrupted. The 3' end of
some histone mRNAs seems to be responsible for this effect (Pandey and Marzluff, 1987). It does not
appear to be mediated by ATTTA, nor is it clear what controls the differential stability of this mRNA.
Another example is the differential stability of IgG mRNA in B lymphocytes during B cell maturation
(Genovese and Milcarek, 1988). A final example is the instability of a mutant beta-thalassemic globin mRNA.
In bone marrow cells, where this gene is normally expressed, the mutant mRNA is unstable, while the
wild-type mRNA is stable. When the mutant gene is expressed in HeLa or L cells in vitro, the mutant mRNA
shows no instability (Lim et al., 1988). These examples all provide evidence that mRNA stability can be
mediated by cell type or cell cycle specific factors. Furthermore, this type of instability is not yet associated
with specific sequences. Given these uncertainties, it is not possible to predict which RNAs are likely to be
unstable in a given cell. In addition, even the ATTTA motif may act differentially depending on the nature of
the cell in which the RNA is present. Shaw and Kamen (1987) have reported that activation of protein kinase
C can block degradation mediated by ATTTA.

The addition of a polyadenylate string to the 3' end is common to most eucaryotic mRNAs, both plant
and animal. The currently accepted view of polyA addition is that the nascent transcript extends beyond the
mature 3' terminus. Contained within this transcript are signals for polyadenylation and proper 3' end
formation. This processing at the 3' end involves cleavage of the mRNA and addition of polyA to the mature
3' end. By searching for consensus sequences near the polyA tract in both plant and animal mRNAs, it has
been possible to identify consensus sequences that apparently are involved in polyA addition and 3' end
cleavage. The same consensus sequences seem to be important to both of these processes. These signals
are typically a variation on the sequence AATAAA. In animal cells, some variants of this sequence that are
functional have been identified; in plant cells there seems to be an extended range of functional sequences
(Wickens and Stephenson, 1984; Dean et al., 1986). Because all of these consensus sequences are
variations on AATAAA, they all are A+T rich sequences. This sequence is typically found 15 to 20 bp
before the polyA tract in a mature mRNA. Experiments in animal cells indicate that this sequence is
involved in both polyA addition and 3' maturation. Site directed mutations in this sequence can disrupt
these functions (Conway and Wickens, 1988; Wickens et al., 1987). However, it has also been observed that
sequences up to 50 to 100 bp 3' to the putative polyA signal are also required; i.e., a gene that has a
normal AATAAA but has been replaced or disrupted downstream does not get properly polyadenylated (Gil
and Proudfoot, 1984; Sadofsky and Alwine, 1984; McDevitt et al., 1984). That is, the polyA signal itself is
not sufficient for complete and proper processing. It is not yet known what specific downstream sequences
are required in addition to the polyA signal, or if there is a specific sequence that has this function.
Therefore, sequence analysis can only identify potential polyA signals.

In naturally occurring mRNAs that are normally polyadenylated, it has been observed that disruption of
this process, either by altering the polyA signal or other sequences in the mRNA, can have profound effects
on the level of functional mRNA. This has been observed in several naturally occurring mRNAs, with
results that are gene specific so far. There are no general rules that can be derived yet from the study of
mutants of these natural genes, and no rules that can be applied to heterologous genes. Below are four
examples:

1. In a globin gene, absence of a proper polyA site leads to improper termination of transcription. It is
likely, but not proven, that the improperly terminated RNA is nonfunctional and unstable (Proudfoot et al.,
1987).

2. In a globin gene, absence of a functional polyA signal can lead to a 100-fold decrease in the level
of mRNA accumulation (Proudfoot et al., 1987).

3. A globin gene polyA site was placed into the 3' ends of two different histone genes. The histone
genes contain a secondary structure (stem-loop) near their 3' ends. The amount of properly polyadenylated
histone mRNA produced from these chimeras decreased as the distance between the stem-loop and the
polyA site increased. Also, the two histone genes produced greatly different levels of properly
polyadenylated mRNA. This suggests an interaction between the polyA site and other sequences on the
mRNA that can modulate mRNA accumulation (Pandey and Marzluff, 1987).

4. The soybean leghemoglobin gene has been cloned into HeLa cells, and it has been determined
that this plant gene contains a "cryptic" polyadenylation signal that is active in animal cells, but is not
utilized in plant cells. This leads to the production of a new polyadenylated mRNA that is nonfunctional.
This again shows that analysis of a gene in one cell type cannot predict its behavior in alternative cell types
(Wiebauer et al., 1988).

From these examples, it is clear that in natural mRNAs proper polyadenylation is important in mRNA
accumulation, and that disruption of this process can affect mRNA levels significantly. However, insufficient
knowledge exists to predict the effect of changes in a normal gene. In a heterologous gene, where we do
not know if the putative polyA sites (consensus sequences) are functional, it is even harder to predict the
consequences. However, it is possible that the putative sites identified are dysfunctional. That is, these sites
may not act as proper polyA sites, but instead function as aberrant sites that give rise to unstable mRNAs.

In animal cell systems, AATAAA is by far the most common signal identified in mRNAs upstream of the
polyA, but at least four variants have also been found (Wickens and Stephenson, 1984). In plants, not nearly
so much analysis has been done, but it is clear that multiple sequences similar to AATAAA can be used.
The plant sites below called major or minor refer only to the study of Dean et al. (1986), which analyzed
only three types of plant gene. The designation of polyadenylation sites as major or minor refers only to the
frequency of their occurrence as functional sites in naturally occurring genes that have been analyzed. In
the case of plants this is a very limited database. It is hard to predict with any certainty that a site
designated major or minor is more or less likely to function partially or completely when found in a
heterologous gene such as a B.t. gene.


AATAAA    Major consensus site
AATAAT    Major plant site
AACCAA    Minor plant site
ATATAA    Minor plant site
AATCAA    Minor plant site
ATACTA    Minor plant site
ATAAAA    Minor plant site
ATGAAA    Minor plant site
AAGCAT    Minor plant site
ATTAAT    Minor plant site
ATACAT    Minor plant site
AAAATA    Minor plant site
ATTAAA    Minor animal site
AATTAA    Minor animal site
AATACA    Minor animal site
CATAAA    Minor animal site
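The sixteen hexamers listed above lend themselves to a simple sequence scan. The following Python sketch is illustrative only: the names `PUTATIVE_POLYA_SIGNALS` and `find_polya_signals` are hypothetical, and the scan merely reports where a listed hexamer occurs. Consistent with the discussion above, a hit identifies a potential signal only; it says nothing about whether the site is functional in a given cell.

```python
# Hexamers and designations copied from the table above.
PUTATIVE_POLYA_SIGNALS = {
    "AATAAA": "major consensus site",
    "AATAAT": "major plant site",
    "AACCAA": "minor plant site",
    "ATATAA": "minor plant site",
    "AATCAA": "minor plant site",
    "ATACTA": "minor plant site",
    "ATAAAA": "minor plant site",
    "ATGAAA": "minor plant site",
    "AAGCAT": "minor plant site",
    "ATTAAT": "minor plant site",
    "ATACAT": "minor plant site",
    "AAAATA": "minor plant site",
    "ATTAAA": "minor animal site",
    "AATTAA": "minor animal site",
    "AATACA": "minor animal site",
    "CATAAA": "minor animal site",
}


def find_polya_signals(seq):
    """Return (position, hexamer, designation) for every listed hexamer.

    Every start position is checked, so overlapping hits are all reported.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in PUTATIVE_POLYA_SIGNALS:
            hits.append((i, hexamer, PUTATIVE_POLYA_SIGNALS[hexamer]))
    return hits


if __name__ == "__main__":
    for hit in find_polya_signals("GGAATAAAAGG"):
        print(hit)
```

Checking every start position, instead of using a non-overlapping search, matters in A+T rich sequence, where listed hexamers frequently overlap one another.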


Another type of RNA processing that occurs in the nucleus is intron splicing. Nearly all of the work on
intron processing has been done in animal cells, but some data is emerging from plants. Intron processing
depends on proper 5' and 3' splice junction sequences. Consensus sequences for these junctions have
been derived for both animal and plant mRNAs, but only a few nucleotides are known to be invariant.
Therefore, it is hard to predict with any certainty whether a putative splice junction is functional or partly
functional based solely on sequence analysis. In particular, the only invariant nucleotides are GT at the 5'
end of the intron and AG at the 3' end of the intron. In plants, at every nearby position, either within the
intron or in the exon sequence flanking the intron, all four nucleotides can be found, although some
positions show some nucleotide preference (Brown, 1986; Hanley and Schuler, 1988).

A plant intron has been moved from a patatin gene into a GUS gene. To do this, site directed
mutagenesis was performed to introduce new restriction sites, and this mutagenesis changed several
nucleotides in the intron and exon sequences flanking the GT and AG. This intron still functioned properly,
indicating the importance of the GT and AG and the flexibility at other nucleotide positions. There are of
course many occurrences of GT and AG in all genes that do not function as intron splice junctions, so there
must be some other sequence or structural features that identify splice junctions. In plants, one such feature
appears to be base composition per se. Wiebauer et al. (1988) and Goodall et al. (1988) have analyzed
plant introns and exons and found that exons have ~50% A+T while introns have ~70% A+T. Goodall et
al. (1988) also created an artificial plant intron that has consensus 5' and 3' splice junctions and a random
A+T rich internal sequence. This intron was spliced correctly in plants. When the internal segment was
replaced by a G+C rich sequence, splicing efficiency was drastically reduced. These two examples
demonstrate that intron recognition in plants may depend on very general features: splice junctions that
have a great deal of sequence diversity, and A+T richness of the intron itself. This, of course, makes it
difficult to predict from sequence alone whether any particular sequence is likely to function as an active or
partially active intron for RNA processing.

B.t. genes, being A+T rich, contain numerous stretches of various lengths that have 70% or greater
A+T. The number of such stretches identified by sequence analysis depends on the length of sequence
scanned.
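The dependence of the count on the length of sequence scanned can be seen with a sliding-window calculation. The Python sketch below is illustrative only: the function names and the toy sequence are hypothetical, and the 70% default simply mirrors the figure quoted above. It lists the start positions of all windows of a chosen length whose A+T content meets the threshold.

```python
def at_count(seq):
    """Number of A or T bases in a sequence."""
    return sum(base in "AT" for base in seq.upper())


def at_rich_windows(seq, window, min_percent=70):
    """Start positions of windows whose A+T content is >= min_percent.

    Integer arithmetic is used so that a window sitting exactly at the
    threshold is not lost to floating point rounding.
    """
    seq = seq.upper()
    starts = []
    for i in range(len(seq) - window + 1):
        if at_count(seq[i:i + window]) * 100 >= min_percent * window:
            starts.append(i)
    return starts


if __name__ == "__main__":
    toy = "GCGCATATATATGCGC"  # hypothetical sequence, for illustration only
    for window in (6, 10, 16):
        print(window, at_rich_windows(toy, window))
```

On the toy sequence, 6-base and 10-base windows each flag several positions, while a single 16-base window covering the whole sequence flags none, which is the scan-length effect the text describes.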

As for polyadenylation described above, there are complications in predicting what sequences might be
utilized as splice sites in any given gene. First, many naturally occurring genes have alternative splicing
pathways that create alternative combinations of exons in the final mRNA (Gallega and Nadal-Ginard, 1988;
Helfman and Ricci, 1988; Tsurushita and Korn, 1989). That is, some splice junctions are apparently
recognized under some circumstances or in certain cell types, but not in others. The rules governing this
are not understood. In addition, there can be an interaction between processing paths such that utilization of
a particular polyadenylation site can interfere with splicing at a nearby splice site and vice versa (Adami and
Nevins, 1988; Brady and Wold, 1988; Marzluff and Pandey, 1988). Again no predictive rules are available.
Also, sequence changes in a gene can drastically alter the utilization of particular splice junctions. For
example, in a bovine growth hormone gene, small deletions in an exon a few hundred bases downstream of
an intron cause the splicing efficiency of the intron to drop from greater than 95% to less than 2%
(essentially nonfunctional). Other deletions, however, have essentially no effect (Hampson and Rottman,
1988). Finally, a variety of in vitro and in vivo experiments indicate that mutations that disrupt normal
splicing lead to rapid degradation of the RNA in the nucleus. Splicing is a multistep process in the nucleus
and mutations in normal splicing can lead to blockades in the process at a variety of steps. Any of these
blockades can then lead to an abnormal and unstable RNA. Studies of mutants of normally processed
(polyadenylation and splicing) genes are relevant to the study of heterologous genes such as B.t. genes.
B.t. genes might contain functional signals that lead to the production of aberrant nonfunctional mRNAs, and
these mRNAs are likely to be unstable. But the B.t. genes are perhaps even more likely to contain signals
that are analogous to mutant signals in a natural gene. As shown above, these mutant signals are very likely
to cause defects in the processing pathways whose consequence is to produce unstable mRNAs.

It is not known with any certainty what signals RNA transcription termination in plant or animal cells.
Some studies on animal genes indicate that stretches of sequence rich in T cause termination by calf
thymus RNA polymerase II in vitro. These studies have shown that the 3' ends of in vitro terminated
transcripts often lie within runs of T such as T5, T6 or T7. Other identified sites have not been composed
solely of T, but have had one or more other nucleotides as well. Termination has been found to occur within
the sequences TATTTTTT, ATTCTC, TTCTT (Dedrick et al., 1987; Reines et al., 1987). In the case of these
latter two, the context in which the sequence is found has been C+T rich as well. It is not known if this is
essential. Other studies have implicated stretches of A as potential transcriptional terminators. An interesting
example from SV40 illustrates the uncertainty in defining terminators based on sequence alone. One
potential terminator in SV40 was identified as being A rich and having a region of dyad symmetry (potential
stem-loop) 5' to the A rich stretch. However, a second terminator identified experimentally downstream in
the same gene was not A rich and included no potential secondary structure (Kessler et al., 1988). Of
course, due to the A+T content of B.t. genes, they are rich in runs of A or T that could act as terminators.
The importance of termination to stability of the mRNA is shown by the globin gene example described
above. Absence of a normal polyA site leads to a failure in proper termination with a consequent decrease
in mRNA.

There is also an effect on mRNA stability due to the translation of the mRNA. Premature translational
termination in human triose phosphate isomerase leads to instability of the mRNA (Daar et al., 1988).
Another example is the beta-thalassemic globin mRNA described above that is specifically unstable in bone
marrow cells (Lim et al., 1988). The defect in this mutant gene is a single base pair deletion at codon 44
that leads to translational termination (a nonsense codon) at codon 60. Compared to properly translated
normal globin mRNA, this mutant RNA is very unstable. These results indicate that an improperly translated
mRNA is unstable. Other work in yeast indicates that proper but poor translation can have an effect on
mRNA levels. A heterologous gene was modified to convert certain codons to more yeast preferred codons.
An overall 10-fold increase in protein production was achieved, but there was also about a 3-fold increase in
mRNA (Hoekema et al., 1987). This indicates that more efficient translation can lead to greater mRNA
stability, and that the effect of codon usage can be at the RNA level as well as the translational level. It is
not clear from codon usage studies which codons lead to poor translation, or how this is coupled to mRNA
stability.

Therefore, it is an object of the present invention to provide a method for preparing synthetic plant
genes which express their respective proteins at relatively high levels when compared to wild-type genes. It
is yet another object of the present invention to provide synthetic plant genes which express the crystal
protein toxin of Bacillus thuringiensis at relatively high levels.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 illustrates the steps employed in modifying a wild-type gene to increase expression
efficiency in plants.

Figure 2 illustrates a comparison of the changes in the modified B.t.k. HD-1 sequence of Example 1
(lower line) versus the wild-type sequence of B.t.k. HD-1 which encodes the crystal protein toxin (upper
line).

Figure 3 illustrates a comparison of the changes in the synthetic B.t.k. HD-1 sequence of Example 2
(lower line) versus the wild-type sequence of B.t.k. HD-1 which encodes the crystal protein toxin (upper
line).

Figure 4 illustrates a comparison of the changes in the synthetic B.t.k. HD-73 sequence of Example 3
(lower line) versus the wild-type sequence of B.t.k. HD-73 (upper line).

Figure 5 represents a plasmid map of intermediate plant transformation vector cassette pMON893.

Figure 6 represents a plasmid map of intermediate plant transformation vector cassette pMON900.

Figure 7 represents a map for the disarmed T-DNA of A. tumefaciens ACO.

Figure 8 illustrates a comparison of the changes in the synthetic truncated B.t.k. HD-73 gene (amino
acids 29-615 with an N-terminal Met-Ala) of Example 3 (lower line) versus the wild-type sequence of B.t.k.
HD-73 (upper line).

Figure 9 illustrates a comparison of the changes in the synthetic/wild-type full length B.t.k. HD-73
sequence of Example 3 (lower line) versus the wild-type full-length sequence of B.t.k. HD-73 (upper line).

Figure 10 illustrates a comparison of the changes in the synthetic/modified full length B.t.k. HD-73
sequence of Example 3 (lower line) versus the wild-type full-length sequence of B.t.k. HD-73 (upper line).

Figure 11 illustrates a comparison of the changes in the fully synthetic full-length B.t.k. HD-73
sequence of Example 3 (lower line) versus the wild-type full-length sequence of B.t.k. HD-73 (upper line).

Figure 12 illustrates a comparison of the changes in the synthetic B.t.t. sequence of Example 5
(lower line) versus the wild-type sequence of B.t.t. which encodes the crystal protein toxin (upper line).

Figure 13 illustrates a comparison of the changes in the synthetic B.t. P2 sequence of Example 6
(lower line) versus the wild-type sequence of B.t.k. HD-1 which encodes the P2 protein toxin (upper line).

Figure 14 illustrates a comparison of the changes in the synthetic B.t. entomocidus sequence of
Example 7 (lower line) versus the wild-type sequence of B.t. entomocidus which encodes the Btent protein
toxin (upper line).

Figure 15 illustrates a plasmid map for plant expression cassette vector pMON744.

Figure 16 illustrates a comparison of the changes in the synthetic potato leaf roll virus (PLRV) coat
protein sequence of Example 9 (lower line) versus the wild-type coat protein sequence of PLRV (upper
line).


STATEMENT OF THE INVENTION 


The present invention provides a method for preparing synthetic plant genes which genes express their
protein product at levels significantly higher than the wild-type genes which were commonly employed in
plant transformation heretofore. In another aspect, the present invention also provides novel synthetic plant
genes which encode non-plant proteins.

For brevity and clarity of description, the present invention will be primarily described with respect to
the preparation of synthetic plant genes which encode the crystal protein toxin of Bacillus thuringiensis
(B.t.). Suitable B.t. subspecies include, but are not limited to, B.t. kurstaki HD-1, B.t. kurstaki HD-73, B.t.
sotto, B.t. berliner, B.t. thuringiensis, B.t. tolworthi, B.t. dendrolimus, B.t. alesti, B.t. galleriae, B.t.
aizawai, B.t. subtoxicus, B.t. entomocidus, B.t. tenebrionis and B.t. san diego. However, those skilled in
the art will recognize and it should be understood that the present method may be used to prepare
synthetic plant genes which encode non-plant proteins other than the crystal protein toxin of B.t. as well as
plant proteins (see for instance, Example 9).

The expression of B.t. genes in plants is problematic. Although the expression of B.t. genes in plants at
insecticidal levels has been reported, this accomplishment has not been straightforward. In particular, the
expression of a full-length lepidopteran specific B.t. gene (comprising DNA from a B.t.k. isolate) has been
reported to be unsuccessful in yielding insecticidal levels of expression in some plant species (Vaeck et al.,
1987 and Barton et al., 1987).

It has been reported that expression of the full-length gene from B.t.k. HD-1 was detectable in tomato
plants but that truncated genes led to a higher frequency of insecticidal plants with an overall higher level of
expression. Truncated genes of B.t. berliner also led to a higher frequency of insecticidal plants in tobacco
(Vaeck et al., 1987). On the other hand, insecticidal plants were provided from lettuce transformants using a
full-length gene.

It has also been reported that the full length gene from B.t.k. HD-73 gave some insecticidal effect in
tobacco (Adang et al., 1987). However, the B.t. mRNA detected in these plants was only 1.7 kb compared
to the expected 3.7 kb, indicating improper expression of the gene. It was suggested that this truncated
mRNA was too short to encode a functional truncated toxin, but there must have been a low level of longer
mRNA in some plants or no insecticidal activity would have been observed. Others have reported in a
publication that they observed a large amount of shorter than expected mRNA from a truncated B.t.k. gene,
but some mRNA of the expected size was also observed. In fact, it was suggested that expression of the
full length gene is toxic to tobacco callus (Barton et al., 1987). The above illustrates that lepidopteran type
B.t. genes are poorly expressed in plants compared to other chimeric genes previously expressed from the
same promoter cassettes.

The expression of B.t.t. in tomato and potato is at levels similar to that of B.t.k. (i.e., poor). B.t.t. and
B.t.k. genes share only limited sequence homology, but they share many common features in terms of
base composition and the presence of particular A+T rich elements.

All reports in the field have noted the lower than expected expression of B.t. genes in plants. In general,
insecticidal efficacy has been measured using insects very sensitive to B.t. toxin such as tobacco
hornworm. Although it has been possible to obtain plants totally protected against tobacco hornworm, it is
important to note that hornworm is up to 500 fold more sensitive to B.t. toxin than some agronomically
important insect pests such as beet armyworm. It is therefore of interest to obtain transgenic plants that are
protected against all important lepidopteran pests (or against Colorado potato beetle in the case of B.t.
tenebrionis) and in addition to have a level of B.t. expression that provides an additional safety margin
over and above the efficacious protection level. It is also important to devise plant genes which function
reproducibly from species to species, so that insect resistant plants can be obtained in a predictable
fashion.

In order to achieve these goals, it is important to understand the nature of the poorer than expected
expression of B.t. genes in plants. The level of stable B.t. mRNA in plants is much lower than expected.
That is, compared to other coding sequences driven by the same promoter, the level of B.t. mRNA
measured by Northern analysis or nuclease protection experiments is much lower. For example, tomato
plant 337 (Fischhoff et al., 1987) was selected as the best expressing plant with pMON9711, which contains
the B.t.k. HD-1 KpnI fragment driven by the CaMV 35S promoter and contains the NOS-NPTII-NOS
selectable marker gene. In this plant the level of B.t. mRNA is between 100 and 1000 fold lower than the
level of NPTII mRNA, even though the 35S promoter is approximately 50-fold stronger than the NOS promoter
(Sanders et al., 1987).

The level of B.t. toxin protein detected in plants is consistent with the low level of B.t. mRNA. Moreover,
the insecticidal efficacy of the transgenic plants correlates with the B.t. protein level, indicating that the toxin
protein produced in plants is biologically active. Therefore, the low level of B.t. toxin expression may be the
result of the low levels of B.t. mRNA.

Messenger RNA levels are determined by the rate of synthesis and rate of degradation. It is the
balance between these two that determines the steady state level of mRNA. The rate of synthesis has been
maximized by the use of the CaMV 35S promoter, a strong constitutive plant expressible promoter. The use
of other plant promoters such as nopaline synthase (NOS), mannopine synthase (MAS) and ribulose
bisphosphate carboxylase small subunit (RUBISCO) has not led to dramatic changes in the levels of B.t.
toxin protein expression, indicating that the effects determining B.t. toxin protein levels are promoter
independent. These data imply that the coding sequences of DNA genes encoding B.t. toxin proteins are
somehow responsible for the poor expression level, and that this effect is manifested by a low level of
accumulated stable mRNA.
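The balance between synthesis and degradation can be put into rough numbers. The Python sketch below is illustrative only and rests on an assumption not stated in the text: a simple first-order model in which the steady state level equals the synthesis rate divided by the degradation rate constant. Under that assumption, it combines the two figures reported for tomato plant 337 (a 35S promoter about 50-fold stronger than NOS, and B.t. mRNA 100 to 1000 fold lower than NPTII mRNA) to estimate how far the B.t. mRNA level falls short of what promoter strength alone would predict. The function names are hypothetical.

```python
def steady_state_level(synthesis_rate, degradation_rate):
    """Steady state level under an assumed first-order decay model."""
    return synthesis_rate / degradation_rate


def apparent_shortfall(promoter_fold_stronger, observed_fold_lower):
    """Fold gap between the promoter-based expectation and the observation.

    If promoter strength alone set the level, the B.t. mRNA would be
    `promoter_fold_stronger` times the reference; it is instead
    `observed_fold_lower` times below it, so the two factors multiply.
    """
    return promoter_fold_stronger * observed_fold_lower


if __name__ == "__main__":
    # Figures quoted in the text for pMON9711 in tomato plant 337.
    print(apparent_shortfall(50, 100))   # lower bound of the gap
    print(apparent_shortfall(50, 1000))  # upper bound of the gap
```

The resulting gap says nothing about mechanism. It could reflect faster degradation, incomplete synthesis of full length transcripts, aberrant processing, or a combination, which is why the text attributes the effect to the coding sequence in general terms.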

Lower than expected levels of mRNA have been observed with four different lepidopteran specific
genes (two from B.t.k. HD-1; B.t. berliner and B.t.k. HD-73) as well as the gene from the coleopteran
specific B.t. tenebrionis. It appears that for lepidopteran type B.t. genes these effects are manifest more
strongly in the full length coding sequences than in the truncated coding sequences. These effects are seen
across plant species, although their magnitude seems greater in some plant species such as tobacco.

The nature of the coding sequences of B.t. genes distinguishes them from plant genes as well as many
other heterologous genes expressed in plants. In particular, B.t. genes are very rich (~62%) in adenine (A)
and thymine (T), while plant genes and most bacterial genes which have been expressed in plants are on
the order of 45-55% A+T. The A+T content of the genome (and thus the genes) of any organism is a
feature of that organism and reflects its evolutionary history. While within any one organism genes have
similar A+T content, the A+T content can vary tremendously from organism to organism. For example,
some Bacillus species have among the most A+T rich genomes while some Streptomyces species are
among the least A+T rich genomes (~30 to 35% A+T).

Due to the degeneracy of the genetic code and the limited number of codon choices for any amino
acid, most of the "excess" A+T of the structural coding sequences of some Bacillus species is found in
the third position of the codons. That is, genes of some Bacillus species have A or T as the third nucleotide
in many codons. Thus A+T content in part can determine codon usage bias. In addition, it is clear that
genes evolve for maximum function in the organism in which they evolve. This means that particular
nucleotide sequences found in a gene from one organism, where they may play no role except to code for
a particular stretch of amino acids, have the potential to be recognized as gene control elements in another
organism (such as transcriptional promoters or terminators, polyA addition sites, intron splice sites, or
specific mRNA degradation signals). It is perhaps surprising that such misread signals are not a more
common feature of heterologous gene expression, but this can be explained in part by the relatively
homogeneous A+T content (~50%) of many organisms. This A+T content plus the nature of the genetic
code put clear constraints on the likelihood of occurrence of any particular oligonucleotide sequence. Thus,
a gene from E. coli with a 50% A+T content is much less likely to contain any particular A+T rich
segment than a gene from B. thuringiensis.
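The two quantities discussed above, overall A+T content and A+T content at the third codon position, can be computed with a few lines of code. The following is a minimal sketch only; the function names and the short test sequence are hypothetical and are not taken from the specification.

```python
def at_content(seq):
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def third_position_at(cds):
    """Fraction of codon third positions that are A or T.

    Assumes cds starts in frame; a trailing partial codon is ignored
    because it has no third position.
    """
    thirds = cds.upper()[2::3]
    return sum(base in "AT" for base in thirds) / len(thirds)


# Hypothetical five-codon sequence (ATG AAA TTA GGT CCA), for illustration only.
toy = "ATGAAATTAGGTCCA"
print(f"A+T overall: {at_content(toy):.2f}")
print(f"A+T at third codon position: {third_position_at(toy):.2f}")
```

A gene whose third-position A+T fraction is well above its overall A+T fraction shows the codon bias described in the paragraph above.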

As described above, the expression of B.t. toxin protein in plants has been problematic. Although the
observations made in other systems described above offer the hope of a means to elevate the expression
level of B.t. toxin proteins in plants, the success obtained by the present method is quite unexpected.
Indeed, inasmuch as it has been recently reported that expression of the full-length B.t.k. toxin protein in
tobacco makes callus tissue necrotic (Barton et al., 1987), one would reasonably expect high level
expression of B.t. toxin protein to be unattainable due to the reported toxicity effects.

In its most rigorous application, the method of the present invention involves the modification of an
existing structural coding sequence ("structural gene") which codes for a particular protein by removal of
ATTTA sequences and putative polyadenylation signals by site directed mutagenesis of the DNA
comprising the structural gene. It is most preferred that substantially all the polyadenylation signals and ATTTA
sequences are removed although enhanced expression levels are observed with only partial removal of
either of the above identified sequences. Alternately, if a synthetic gene is prepared which codes for the
expression of the subject protein, codons are selected to avoid the ATTTA sequence and putative
polyadenylation signals. For purposes of the present invention putative polyadenylation signals include, but
are not necessarily limited to, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA,
ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA. In replacing
the ATTTA sequences and polyadenylation signals, codons are preferably utilized which avoid the codons
which are rarely found in plant genomes.
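The scan for ATTTA sequences and putative polyadenylation signals described above can be sketched as a simple string search. This is an illustrative sketch, not the procedure actually used by the inventors; the hexamer list is the one given in the text, while the function name and the test sequence are hypothetical.

```python
# Putative polyadenylation signals as listed in the text.
POLYA_SIGNALS = [
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA",
    "ATAAAA", "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA",
    "ATTAAA", "AATTAA", "AATACA", "CATAAA",
]


def find_motifs(seq, motifs):
    """Return sorted (0-based position, motif) pairs, including overlapping hits."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical sequence containing one AATAAA and one ATTTA.
example = "GGCAATAAATTTAGG"
print(find_motifs(example, POLYA_SIGNALS))
print(find_motifs(example, ["ATTTA"]))
```

Each reported position marks a candidate site for codon replacement; the replacement itself must preserve the encoded amino acid sequence.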

Another embodiment of the present invention, represented in the flow diagram of Figure 1, employs a
method for the modification of an existing structural gene or alternately the de novo synthesis of a
structural gene which method is somewhat less rigorous than the method first described above. Referring to
Figure 1, the selected DNA sequence is scanned to identify regions with greater than four consecutive
adenine (A) or thymine (T) nucleotides. The A+T regions are scanned for potential plant polyadenylation
signals. Although the absence of five or more consecutive A or T nucleotides eliminates most plant
polyadenylation signals, if there is more than one of the minor polyadenylation signals identified within ten
nucleotides of each other, then the nucleotide sequence of this region is preferably altered to remove these
signals while maintaining the original encoded amino acid sequence.

The second step is to consider the 15 to 30 nucleotide regions surrounding the A+T rich region
identified in step one. If the A+T content of the surrounding region is less than 80%, the region should be
examined for polyadenylation signals. Alteration of the region based on polyadenylation signals is
dependent upon (1) the number of polyadenylation signals present and (2) presence of a major plant
polyadenylation signal.

The extended region is examined for the presence of plant polyadenylation signals. The
polyadenylation signals are removed by site-directed mutagenesis of the DNA sequence. The extended region is also
examined for multiple copies of the ATTTA sequence which are also removed by mutagenesis.
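The first two steps of the scan described above (locating runs of more than four consecutive A or T nucleotides, then measuring the A+T content of the surrounding region) can be sketched as follows. The function names, the default flank of 15 nucleotides (the lower end of the 15 to 30 nucleotide range in the text) and the test sequence are assumptions made for illustration.

```python
import re


def at_runs(seq, min_len=5):
    """Return (start, end) spans of runs of at least min_len consecutive A/T bases."""
    pattern = r"[AT]{%d,}" % min_len
    return [(m.start(), m.end()) for m in re.finditer(pattern, seq.upper())]


def flank_at_fraction(seq, start, end, flank=15):
    """A+T fraction of the run plus up to `flank` nucleotides on each side."""
    lo = max(0, start - flank)
    hi = min(len(seq), end + flank)
    window = seq[lo:hi].upper()
    return sum(base in "AT" for base in window) / len(window)


# Hypothetical sequence with a single six-base A/T run.
example = "GCGCGCATTTAAGCGCGC"
for start, end in at_runs(example):
    frac = flank_at_fraction(example, start, end, flank=3)
    print(start, end, f"{frac:.2f}")
```

The spans returned by `at_runs` are the regions that would then be checked for polyadenylation signals and ATTTA sequences.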

It is also preferred that regions comprising many consecutive A+T bases or G+C bases are disrupted
since these regions are predicted to have a higher likelihood to form hairpin structure due to
self-complementarity. Therefore, insertion of heterogeneous base pairs would reduce the likelihood of
self-complementary secondary structure formation, which is known to inhibit transcription and/or translation in
some organisms. In most cases, the adverse effects may be minimized by using sequences which do not
contain more than five consecutive A+T or G+C.

The oligonucleotides used in the mutagenesis are designed to maintain the proper amino acid
sequence and reading frame and preferably to not introduce common restriction sites such as BglII, HindIII,
SacI, KpnI, EcoRI, NcoI, PstI and SalI into the modified gene. These restriction sites are found in multi-linker
insertion sites of cloning vectors such as plasmids pUC118 and pMON7258. Of course, the introduction of
new polyadenylation signals, ATTTA sequences or consecutive stretches of more than five A+T or G+C
should also be avoided. The preferred size for the oligonucleotides is around 40-50 bases, but fragments
ranging from 18 to 100 bases have been utilized. In most cases, a minimum of 5 to 8 base pairs of
homology to the template DNA on both ends of the synthesized fragment are maintained to ensure proper
hybridization of the primer to the template. The oligonucleotides should avoid sequences longer than five
base pairs A+T or G+C. Codons used in the replacement of wild-type codons should preferably avoid the
TA or CG doublet wherever possible. Codons are selected from a plant preferred codon table (such as
Table I below) so as to avoid codons which are rarely found in plant genomes, and efforts should be made
to select codons to preferably adjust the G+C content to about 50%.
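Several of the oligonucleotide design rules above lend themselves to an automated check. The sketch below flags runs of more than five consecutive A+T or G+C bases, ATTTA sequences, and the restriction sites named in the text. The recognition sequences are the standard ones for these enzymes; the function name and the test oligonucleotides are hypothetical, and the check deliberately omits the codon-level rules (TA/CG doublets, rare codons).

```python
import re

# Standard recognition sequences for the enzymes named in the text.
RESTRICTION_SITES = {
    "BglII": "AGATCT",
    "HindIII": "AAGCTT",
    "SacI": "GAGCTC",
    "KpnI": "GGTACC",
    "EcoRI": "GAATTC",
    "NcoI": "CCATGG",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
}


def design_issues(oligo):
    """Return a list of rule violations found in a candidate oligonucleotide."""
    seq = oligo.upper()
    issues = []
    if re.search(r"[AT]{6,}", seq):
        issues.append("A+T run longer than five")
    if re.search(r"[GC]{6,}", seq):
        issues.append("G+C run longer than five")
    if "ATTTA" in seq:
        issues.append("ATTTA")
    for enzyme, site in RESTRICTION_SITES.items():
        if site in seq:
            issues.append(enzyme + " site")
    return issues


# Hypothetical candidate carrying an EcoRI site and an ATTTA sequence.
print(design_issues("ACGAATTCATTTAGC"))
```

Because all eight recognition sequences are palindromic, scanning one strand is sufficient for the restriction-site part of the check.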
Table I


Preferred Codon Usage in Plants

Amino Acid   Codon   Percent Usage in Plants

GLY          GGA     32
             GGC     20
             GGG     11
             GGU     37

ILE          AUA     12
             AUC     45
             AUU     43

VAL          GUA      9
             GUC     20
             GUG     28
             GUU     43

LYS          AAA     36
             AAG     64

ASN          AAC     72
             AAU     28

GLN          CAA     64
             CAG     36

HIS          CAC     65
             CAU     35

GLU          GAA     48
             GAG     52

ASP          GAC     48
             GAU     52

TYR          UAC     68
             UAU     32

CYS          UGC     78
             UGU     22
be used as described herein. For purposes of this description, the phrase "CaMV35S" promoter thus
includes variations of CaMV35S promoter, e.g., promoters derived by means of ligation with operator
regions, random or controlled mutagenesis, etc. Furthermore, the promoters may be altered to contain
multiple "enhancer sequences" to assist in elevating gene expression.

The RNA produced by a DNA construct of the present invention also contains a 5' non-translated leader
sequence. This sequence can be derived from the promoter selected to express the gene, and can be
specifically modified so as to increase translation of the mRNA. The 5' non-translated regions can also be
obtained from viral RNAs, from suitable eukaryotic genes, or from a synthetic gene sequence. The present
invention is not limited to constructs as presented in the following examples. Rather, the non-translated
leader sequence can be part of the 5' end of the non-translated region of the coding sequence for the virus
coat protein, or part of the promoter sequence, or can be derived from an unrelated promoter or coding
sequence. In any case, it is preferred that the sequence flanking the initiation site conform to the
translational consensus sequence rules for enhanced translation initiation reported by Kozak (1984).

The DNA construct of the present invention also contains a modified or fully-synthetic structural coding
sequence which has been changed to enhance the performance of the gene in plants. In a particular
embodiment of the present invention the enhancement method has been applied to design modified and
fully synthetic genes encoding the crystal toxin protein of Bacillus thuringiensis. The structural genes of
the present invention may optionally encode a fusion protein comprising an amino-terminal chloroplast
transit peptide or secretory signal sequence (see, for instance, Examples 10 and 11).

The DNA construct also contains a 3' non-translated region. The 3' non-translated region contains a
polyadenylation signal which functions in plants to cause the addition of polyadenylate nucleotides to the 3'
end of the viral RNA. Examples of suitable 3' regions are (1) the 3' transcribed, non-translated regions
containing the polyadenylation signal of Agrobacterium tumor-inducing (Ti) plasmid genes, such as the
nopaline synthase (NOS) gene, and (2) plant genes like the soybean storage protein (7S) genes and the
small subunit of the RuBP carboxylase (E9) gene. An example of a preferred 3' region is that from the 7S
gene, described in greater detail in the examples below.

Plant Transformation 

A chimeric plant gene containing a structural coding sequence of the present invention can be inserted
into the genome of a plant by any suitable method. Suitable plants for use in the practice of the present
invention include, but are not limited to, soybean, cotton, alfalfa, oilseed rape, flax, tomato, sugarbeet,
sunflower, potato, tobacco, maize, rice and wheat. Suitable plant transformation vectors include those
derived from a Ti plasmid of Agrobacterium tumefaciens, as well as those disclosed, e.g., by
Herrera-Estrella (1983), Bevan (1983), Klee (1985) and EPO publication 120,516 (Schilperoort et al.). In addition to
plant transformation vectors derived from the Ti or root-inducing (Ri) plasmids of Agrobacterium,
alternative methods can be used to insert the DNA constructs of this invention into plant cells. Such methods
may involve, for example, the use of liposomes, electroporation, chemicals that increase free DNA uptake,
free DNA delivery via microprojectile bombardment, and transformation using viruses or pollen.

A particularly useful Ti plasmid cassette vector for transformation of dicotyledonous plants is shown in
Figure 5. Referring to Figure 5, the expression cassette pMON893 consists of the enhanced CaMV35S
promoter (EN 35S) and the 3' end including polyadenylation signals from a soybean gene encoding the
alpha-prime subunit of beta-conglycinin. Between these two elements is a multilinker containing multiple
restriction sites for the insertion of genes.

The enhanced CaMV35S promoter was constructed as follows. A fragment of the CaMV35S promoter
extending between position -343 and +9 was previously constructed in pUC13 by Odell et al. (1985). This
segment contains a region identified by Odell et al. (1985) as being necessary for maximal expression of
the CaMV35S promoter. It was excised as a ClaI-HindIII fragment, made blunt ended with DNA polymerase
I (Klenow fragment) and inserted into the HincII site of pUC18. This upstream region of the 35S promoter
was excised from this plasmid as a HindIII-EcoRV fragment (extending from -343 to -90) and inserted into
the same plasmid between the HindIII and PstI sites. The enhanced CaMV35S promoter thus contains a
duplication of sequences between -343 and -90 (Kay et al., 1987).

The 3' end of the 7S gene is derived from the 7S gene contained on the clone designated 17.1
(Schuler et al., 1982). This 3' end fragment, which includes the polyadenylation signals, extends from an
AvaII site located about 30 bp upstream of the termination codon for the beta-conglycinin gene in clone
17.1 to an EcoRI site located about 450 bp downstream of this termination codon.

The remainder of pMON893 contains a segment of pBR322 which provides an origin of replication in E.
coli and a region for homologous recombination with the disarmed T-DNA in Agrobacterium [...]
(described below); the oriV region from the broad host range plasmid RK1; the [...]
resistance gene from Tn7; and a chimeric NPTII gene, containing the CaMV35S [...]
synthase (NOS) 3' end, which provides kanamycin resistance in [...]

Referring to Figure 6, transformation vector plasmid pMON900 is a derivative of pMON893. [...]
promoter (Velten et al., 1984). The other segments are the same as plasmid pMON893. After incorporation
of a DNA construct into plasmid vector pMON893 or pMON900, the intermediate vector is introduced into
A. tumefaciens strain ACO which contains a disarmed Ti plasmid. Cointegrate Ti plasmid vectors are
selected and used to transform dicotyledonous plants.

Referring to Figure 7, A. tumefaciens ACO is a disarmed strain similar to pTiB6SE described by Fraley
et al. (1985). For construction of ACO the starting Agrobacterium strain was the strain A208 which contains
a nopaline-type Ti plasmid. The Ti plasmid was disarmed in a manner similar to that described by Fraley et
al. (1985) so that essentially all of the native T-DNA was removed except for the left border and a few
hundred base pairs of T-DNA inside the left border. The remainder of the T-DNA extending to a point just
beyond the right border was replaced with a novel piece of DNA including (from left to right) a segment of
pBR322, the oriV region from plasmid RK2, and the kanamycin resistance gene from Tn601. The pBR322
and oriV segments are similar to the segments in pMON893 and provide a region of homology for
cointegrate formation.

The following examples are provided to better elucidate the practice of the present invention and should
not be interpreted in any way to limit the scope of the present invention. Those skilled in the art will
recognize that various modifications, truncations, etc. can be made to the methods and genes described
herein while not departing from the spirit and scope of the present invention.

Example 1 -- Modified B.t.k. HD-1 Gene

Referring to Figure 2, the wild-type B.t.k. HD-1 gene is known to be expressed poorly in plants as a full
length gene or as a truncated gene. The G+C content of the B.t.k. gene is low [...]
numerous [...]

Table II

List of Sequences of the Potential Polyadenylation Signals

AATAAA*      AAGCAT
AATAAT*      ATTAAT
AACCAA       ATACAT
ATATAA       AAAATA
AATCAA       ATTAAA**
ATACTA       AATTAA**
ATAAAA       AATACA**
ATGAAA       CATAAA**

*  indicates a potential major plant polyadenylation site.
** indicates a potential minor animal polyadenylation site.
All others are potential minor plant polyadenylation sites.

[...] of the B.t.k. HD-1 gene [...] the oligonucleotides [...] in the site-directed mutagenesis.
Table III



Mutagenesis Primers for B.t.k. HD-1 Gene

Primer      Length (bp)   Sequence

BTK185      18            TCCCCAGATA ATATCAAC

BTK240      48            GGCTTGATTC CTAGCGAACT
                          CTTCGATTCT CTGGTTGATG
                          AGCTGTTC

BTK462      54            CAAAACTGAG AGGTGGAGGT
                          TGGCAGCTTG AACGTACACG
                          GAGAGGAGAG GAAC

BTK669      48            AGTTAGTGTA AGCTCTCTTC
                          TGAACTGGTT GTACCTGATC
                          CAATCTCT

BTK930      39            AGCCATGATC TGGTGACCGG
                          ACCAGTAGTA TTCTCCTCT

BTK1110     32            AGTTGTTGGT TGTTGATCCC
                          GATGTT[...]GG

BTK1380A    37            GTGATGAAGG GATGATGTTG
                          TTGAACTCAG CACTACG

BTK1380T    100           CAGAAGTTCC AGAGCCAAGA
                          TTAGTAGACT TGGTGAGTGG
                          GATTTGGGTG ATTTGTGATG
                          AAGGGATGAT GTTGTTGAAC
                          TCAGCACTAC GATGTATCCA

BTK1600     27            TGATGTGTGG AACTGAAGGT
                          TTGTGGT

[...]

The regions for mutagenesis were selected in the following manner. All regions of the DNA [...]
regions which might contain polyadenylation sites (see Table II above) or ATTTA sequences.
Oligonucleotides were designed which maximized the elimination of A+T consecutive regions which
contained one or more polyadenylation sites or ATTTA sequences. Two potential plant polyadenylation
sites were [...] critical (see Table II) based on published reports. Codons were selected which increased
G+C content [...] fragments for efficient hybridization and priming in site-directed mutagenesis
reactions. Figure 2 compares the wild-type B.t.k. HD-1 gene sequence with the sequence which resulted
from the modifications by site-directed mutagenesis.

[...] mutagenesis changes from the amino (5') terminus to the carboxy (3') terminus are as follows:

BTK185 is an 18-mer used to eliminate a plant polyadenylation site in the midst of a nine base pair [...]

[...]

BTK462 is a 54-mer introducing [...] changes [...]
richness of the gene by replacing wild-type codons with codons containing G and C while avoiding the CG
doublet. The next seven changes made by BTK462 were used to eliminate an A+T rich region (13 of 14
base pairs were A or T) containing two ATTTA regions.

BTK669 is a 48-mer making nine individual base pair changes eliminating three possible
polyadenylation sites (ATATAA, AATCAA, and AATTAA) and a single ATTTA site.

BTK930 is a 39-mer designed to increase the G+C content and to eliminate a potential polyadenylation
site (AATAAT - a major site). This region did contain a nine base pair region of consecutive A+T sequence.
One of the base pair changes was a G to A because a G at this position would have created a G+C rich
region (CCGG(G)C). Since sequencing reactions indicate that there can be difficulties generating sequence
through G+C consecutive bases, it was thought to be prudent to avoid generating potentially problematic
regions even if they were problematic only in vitro.

BTK1110 is a 32-mer designed to introduce five changes in the wild-type gene. One potential site
(AATAAT - a major site) was eliminated in the midst of an A+T rich region (19 of 22 base pairs).

BTK1380A and BTK1380T are responsible for 14 individual base pair changes. The first region (1380A)
has 17 consecutive A+T base pairs. In this region is an ATTTA and a potential polyadenylation site
(AATAAT). The 100-mer (1380T) contains all the changes dictated by 1380A. The large size of this primer
was in part an experiment to determine if it was feasible to utilize large oligonucleotides for mutagenesis
(over 60 bases in length). A second consideration was that the 100-mer was used to mutagenize a template
which had previously been mutagenized by 1380A. The original primer ordered to mutagenize the region
downstream and adjacent to 1380A did not anneal efficiently to the desired site as indicated by an inability
to obtain clean sequence utilizing the primer. The large region of homology of 1380T did assure proper
annealing. The extended size of 1380T was more of a convenience than a necessity. The second
region adjacent to 1380A covered by 1380T has a high A+T content (22 of 29 bases are A or T).

BTK1600 is a 27-mer responsible for five individual base pair changes. An ATTTA region and a plant
polyadenylation site were identified and the appropriate changes engineered.

A total of 62 bases were changed by site-directed mutagenesis. The G+C content increased by 55
base pairs, the potential polyadenylation sites were reduced from 10 to seven and the ATTTA sequences
decreased from 13 to seven. The changes in the DNA sequence resulted in changes in 55 of the 579
codons in the truncated B.t.k. gene in pMON5342 (approximately 9.5%).

Referring to Table IV, modified B.t.k. HD-1 genes were constructed that contained all of the above
modifications (pMON5370) or various subsets of individual modifications. These genes were inserted into
pMON893 for plant transformation and tobacco plants containing these genes were analyzed. The analysis
of tobacco plants with the individual modifications was undertaken for several reasons. Expression of the
wild type truncated gene in tobacco is very poor, resulting in infrequent identification of plants toxic to
tobacco hornworm (THW). Toxicity is defined by leaf feeding assays as at least 60% mortality of tobacco
hornworm neonate larvae with a damage rating of 1 or less (scale is 0 to 4; 0 is equivalent to total
protection, 4 total damage). The modified HD-1 gene (pMON5370) shows a large increase in expression
(estimated to be approximately 100-fold; see Table VIII) in tobacco. Therefore, increases in expression of
the wild-type gene due to individual modifications would be apparent as a large increase in the frequency
of toxic tobacco plants and the presence of detectable B.t.k. protein. Results are shown in the following table.


Table IV

Relative Effects of Regional Modifications within the B.t.k. Gene

Construct    Position Modified                              # of Plants   # of Toxic Plants

pMON5370     185, 240, 462, 669, 930, 1110, 1380a+b, 1600   38            22
pMON10707    185, 240, 462, 669                             48            19
pMON10706    930, 1110, 1380a+b, 1600                       43            1
pMON10539    185                                            55            [...]
pMON10537    240                                            57            17
pMON10540    185, 240                                       88            33
pMON10705    462                                            47            1

The effects of each individual oligonucleotide's changes on expression reveal some overall trends.
Six different constructs were generated which were designed to identify the key regions. The nine different
oligonucleotides were divided in half by their position on the gene. Changes in the N-terminal half were
incorporated into pMON10707 (185, 240, 462, 669). C-terminal half changes were incorporated into
pMON10706 (930, 1110, 1380a+b, 1600). The results of analysis of plants with these two constructs indicate
that pMON10707 produces a substantial number of toxic plants (19 of 48). Protein from these plants is
detectable by ELISA analysis. pMON10706 plants were rarely identified as insecticidal (1 of 43) and the
levels of B.t.k. were barely detectable by immunological analysis. Investigation of the N-terminal changes in
greater detail was done with 4 pMON constructs: 10539 (185 alone), 10537 (240 alone), 10540 (185 and
240) and 10705 (462 alone). The results indicate that the presence of the changes in 240 was required to
generate a substantial number of toxic plants (pMON10540, 23 of 88; pMON10537, 17 of 57). The absence
of the 240 changes resulted in a low frequency of toxic plants with low B.t.k. protein levels, identical to
results with the wild type gene. These results indicate that the changes in 240 are responsible for a
substantial increase in B.t.k. expression levels over an analogous wild-type construct in tobacco. Changes in
additional regions (185, 462, 669) in conjunction with 240 may result in increases in B.t.k. expression (~2
fold). However, changes at the 240 region of the N-terminal portion of the gene do result in dramatic
increases in expression.
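The comparison among constructs rests on the fraction of transformed plants scored as toxic. As a small worked illustration, the sketch below computes and ranks those fractions. The counts for pMON10707, pMON10706, pMON10540 and pMON10537 are the ones quoted in the paragraph above and the count for pMON5370 is taken from Table IV; the variable and function names are hypothetical.

```python
# (toxic plants, total plants) per construct, as quoted in the text and Table IV.
RESULTS = {
    "pMON5370": (22, 38),
    "pMON10707": (19, 48),
    "pMON10706": (1, 43),
    "pMON10540": (23, 88),
    "pMON10537": (17, 57),
}


def toxic_fraction(toxic, total):
    """Fraction of analyzed plants that were scored as toxic."""
    return toxic / total


# Rank constructs from highest to lowest frequency of toxic plants.
ranked = sorted(RESULTS, key=lambda name: toxic_fraction(*RESULTS[name]), reverse=True)
for name in ranked:
    toxic, total = RESULTS[name]
    print(f"{name}: {toxic}/{total} = {toxic_fraction(toxic, total):.1%}")
```

On these counts every construct carrying the 240 changes shows a toxic-plant frequency above 25%, while the construct carrying only C-terminal changes (pMON10706) stays near 2%.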

Despite the importance of the alteration of the 240 region in expression of modified genes, increased
expression can be achieved by alteration of other regions. Hybrid genes, part wild-type, part synthetic, were
generated to determine the effects of synthetic gene segments on the levels of B.t.k. expression. A hybrid
gene was generated with a synthetic N-terminal third (base pair 1 to 590 of Figure 2; to the XbaI site) with
the C-terminal wild type B.t.k. HD-1 (pMON5378). Plants transformed with this vector were as toxic as plants
transformed with the modified HD-1 gene (pMON5370). This is consistent with the alteration of the 240
region. However, pMON10538, a hybrid with a wild-type N-terminal third (wild type gene for the first 600
base pairs, to the second XbaI site) and a synthetic C-terminal last two-thirds (base pair 590 to 1845 of
Figure 3), was used to transform tobacco and resulted in a dramatic increase in expression. The levels of
expression do not appear to be as high as those seen with the synthetic gene, but are comparable to the
modified gene levels. These results indicate that modification of the 240 segment is not essential to
increased expression since pMON10538 has an intact 240 region. A fully synthetic gene is, in most cases,
superior for expression levels of B.t.k. (See Example 2.)

Example 2 -- Fully Synthetic B.t.k. HD-1 Gene

A synthetic B.t.k. HD-1 gene was designed using the preferred plant codons listed in Table V below.
Table V lists the codons and frequency of use in plant genes of dicotyledonous plants compared to the
frequency of their use in the wild type B.t.k. HD-1 gene (amino acids 1-615) and the synthetic gene of this
example. The total number of each amino acid in this segment of the gene is listed in the parentheses under
the amino acid designation.
Table V


Codon Usage in Synthetic B.t.k. HD-1 Gene

                                Percent Usage in
Amino Acid   Codon   Plants   Wt B.t.k.   Syn

ARG          CGA       7        11          2
(43)         CGC      11         5          5
             CGG       5         2          0
             CGU      25        14         27
             AGA      29        55         41
             AGG      23        14         25

LEU          CUA       8        16          4
(49)         CUC      20         0         20
             CUG      10         2          6
             CUU      28        22         24
             UUA       5        50          0
             UUG      30        10         45

SER          UCA      14       [...]        5
(64)         UCC      26         9         28
             UCG       3         8          0
             UCU      21       [...]       31
             AGC      21       [...]       32
             AGU      15        31          5

THR          ACA      21        31         14
(42)         ACC      41        19         53
             ACG       7        14          0
             ACU      31        36         33

PRO          CCA      45        35         53
(34)         CCC      19         6         12
             CCG       9        21          3
             CCU      26        38         32

ALA          GCA      23        38         26
(31)         GCC      32         9         29
             GCG       3         3          0
             GCU      41        50         45

GLY          GGA      32        52         45
(46)         GGC      20        17         15
             GGG      11        15          6
             GGU      37        15         34

ILE          AUA      12        39        [...]
(46)         AUC      45        11        [...]
             AUU      43        50         30

VAL          GUA       9        45          3
(38)         GUC      20         5         16
             GUG      28        11         37
             GUU      43        39         45

LYS          AAA      36       100         33
(3)          AAG      64         0         67

ASN          AAC      72        27         80
(44)         AAU      28        73         20

GLN          CAA      64        77         61
(31)         CAG      36        23         39

HIS          CAC      65         0         80
(10)         CAU      35       100         20

GLU          GAA      48        87         50
(30)         GAG      52        13         50

ASP          GAC      48        17         65
(23)         GAU      52        83         35

TYR          UAC      68        20         72
(25)         UAU      32        80         28

CYS          UGC      78        50        100
(2)          UGU      22        50          0

PHE          UUC      56        17         83
(36)         UUU      44        83         17

MET          AUG     100       100        100
(9)

TRP          UGG     100       100        100
(9)
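A codon usage table of this kind can drive a simple back-translation from amino acid sequence to DNA. The following sketch picks, for each residue, the codon with the highest plant usage. It is a deliberate simplification and not the design procedure of this example: the synthetic gene in Table V uses a mixture of codons, and the sketch covers only the amino acids whose plant usage values appear in the table fragment reproduced here. The dictionary layout, one-letter residue codes and function name are assumptions.

```python
# Percent usage in plants, written in the DNA alphabet (T in place of U).
PLANT_USAGE = {
    "G": {"GGA": 32, "GGC": 20, "GGG": 11, "GGT": 37},
    "I": {"ATA": 12, "ATC": 45, "ATT": 43},
    "V": {"GTA": 9, "GTC": 20, "GTG": 28, "GTT": 43},
    "K": {"AAA": 36, "AAG": 64},
    "N": {"AAC": 72, "AAT": 28},
    "Q": {"CAA": 64, "CAG": 36},
    "H": {"CAC": 65, "CAT": 35},
    "E": {"GAA": 48, "GAG": 52},
    "D": {"GAC": 48, "GAT": 52},
    "Y": {"TAC": 68, "TAT": 32},
    "C": {"TGC": 78, "TGT": 22},
    "M": {"ATG": 100},
    "W": {"TGG": 100},
}


def back_translate(protein):
    """Back-translate using the most frequently used plant codon per residue.

    Raises KeyError for residues missing from PLANT_USAGE.
    """
    codons = []
    for residue in protein.upper():
        usage = PLANT_USAGE[residue]
        codons.append(max(usage, key=usage.get))
    return "".join(codons)


# Hypothetical six-residue peptide, for illustration only.
dna = back_translate("MKNIVG")
print(dna)
```

A fuller design tool would follow this step with the motif and run-length checks described earlier, swapping in alternative codons wherever a polyadenylation signal, an ATTTA sequence or a long A+T or G+C run appears.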

The resulting synthetic gene lacks ATTTA sequences, contains only one potential polyadenylation site
and has a G+C content of 48.5%. Figure 3 is a comparison of the wild-type HD-1 sequence to the
synthetic gene sequence for amino acids 1-615. There is approximately 77% DNA homology between the
synthetic gene and the wild-type gene and 356 of the 615 codons have been changed (approximately
60%).
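The two comparison figures quoted above, percent DNA homology and number of changed codons, can be obtained from a position-by-position comparison of two aligned coding sequences of equal length. The sketch below assumes no insertions or deletions; the function names and the twelve-base test sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical bases in two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def changed_codons(seq_a, seq_b):
    """Number of in-frame codons that differ between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    a, b = seq_a.upper(), seq_b.upper()
    return sum(a[i:i + 3] != b[i:i + 3] for i in range(0, len(a), 3))


# Hypothetical pair: the second and third codons are replaced by synonymous codons.
wild_type = "ATGAAATTAGGT"
synthetic = "ATGAAGCTTGGT"
print(percent_identity(wild_type, synthetic))
print(changed_codons(wild_type, synthetic))
```

For the figures in the text, 356 changed codons out of 615 works out to about 58%, which the text rounds to approximately 60%.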


Example 3 -- Synthetic B.t.k. HD-73 Gene

The crystal protein toxin from B.t.k. HD-73 exhibits a higher unit activity against some important
agricultural pests. The toxin proteins of HD-1 and HD-73 exhibit substantial homology (~90%) in the
N-terminal 450 amino acids, but differ substantially in the amino acid region 451-615. Fusion proteins
comprising amino acids 1-450 of HD-1 and 451-615 of HD-73 exhibit the insecticidal properties of the
wild-type HD-73. The strategy employed was to use the 5' two-thirds of the synthetic HD-1 gene (first 1350
bases, up to the SacI site) and to dramatically modify the final 590 bases (through amino acid 645) of the
HD-73 in a manner consistent with the algorithm used to design the synthetic HD-1 gene. Table VI below
lists the oligonucleotides used to modify the HD-73 gene in the order used in the gene from 5' to 3' end.
Nine oligonucleotides were used in a 590 base pair region, each oligonucleotide ranging in size from 33 to 60
bases. The only regions left unchanged were areas where there were no long consecutive strings of A or T
bases (longer than six). All polyadenylation sites and ATTTA sites were eliminated.
Table VI




Mutagenesis Primers for B.t.k. HD-73

Primer      Length (bp)   Sequence

73K1363     51            AATACTATCG GATGCGATGA
                          TGTTGTTGAA CTCAGCACTA
                          CGGTGTATCC A

73K1437     33            TCCTGAAATG ACAGAACCGT
                          TGAAGAGAAA GTT

73K1471     48            ATTTCCACTG CTGTTGAGTC
                          TAACGAGGTC TCCACCAGTG
                          AATCCTGG

73K1561     60            GTGAATAGGG GTCACAGAAG
                          CATACCTCAC ACGAACTCTA
                          TATCTGGTAG ATGTTGGATG

73K1642     33            TGTAGCTGGA ACTGTATTG[...]
                          AGAAGATGGA TGA

73K1675     48            TTCAAAGTAA CCGAAATCGC
                          TGGATTGGAG ATTATCCAAG
                          GAGGTAGC

73K1741     39            ACTAAAGTTT CTAACACCCA
                          CGATGTTAC[...] GAGTGAAGA

73K1797     36            AACTGGAATG AACTCGAATC
                          TGTCGATAAT CACTCC

73KTERM     54            GGACACTAGA TCTTAGTGAT
                          AATCGGTCAC ATTTGTCTTG
                          AGTCCAAGCT GGTT

The resulting gene has two potential polyadenylation sites (compared to 10 in the WT) and no ATTTA
sequence (12 in the WT). The G+C content has increased from 37% to 48%. A total of 59 individual base
pair changes were made using the primers in Table VI. Overall, there is 90% DNA homology between the
region of the HD-73 gene modified by site directed mutagenesis and the wild-type sequence of the
analogous region of HD-73. The synthetic HD-73 is a hybrid of the first 1360 bases from the synthetic HD-1
and the next 590 bases or so of modified HD-73 sequence. Figure 4 is a comparison of the above-described
synthetic B.t.k. HD-73 and the wild-type B.t.k. HD-73 encoding amino acids 1-645. In the modified region of
the HD-73 gene 44 of the 170 codons (25%) were changed as a result of the site-directed mutagenesis
changes resulting from the oligonucleotides found in Table VI. Overall, approximately 50% of the codons in
the synthetic B.t.k. HD-73 differ from the analogous segment of the wild-type HD-73 gene.

A one base pair deletion in the synthetic HD-73 gene was detected in the course of sequencing the 3'
end at base pair 1890. This results in a frame-shift mutation at amino acid 625 with a premature stop codon
at amino acid 640 (pMON5379). Table VII below compares the codon usage of the wild-type gene of B.t.k.
HD-73 versus the synthetic gene of this example for amino acids 451-645 and codon usage of naturally
occurring genes of dicotyledonous plants. The total number of each amino acid encoded in this segment of
the gene is found in the parentheses under the amino acid designation.
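The effect of a one base pair deletion, as noted above for the synthetic HD-73 gene, is to shift the reading frame so that a stop codon may be read earlier than intended. The sketch below illustrates the mechanism on a short made-up sequence; it does not reproduce the actual HD-73 sequence or the positions quoted in the text, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon_index(cds):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def delete_base(cds, position):
    """Return cds with the base at the given 0-based position removed."""
    return cds[:position] + cds[position + 1:]


# Hypothetical reading frame: ATG GCC TTG ACT GGA AAG TAA.
intact = "ATGGCCTTGACTGGAAAGTAA"
shifted = delete_base(intact, 3)  # frame becomes ATG CCT TGA ...
print(first_stop_codon_index(intact))
print(first_stop_codon_index(shifted))
```

In the made-up sequence the stop moves from the seventh codon to the third after the deletion, the same kind of premature termination reported for pMON5379.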


Table VII

Codon Usage in Synthetic B.t.k. HD-73 Gene

                          Percent Usage in
Amino Acid   Codon   Plants   Wt HD-73   Syn

ARG          CGA        7        10        0
(10)         CGC       11         0        3
             CGG        5        10        0
             CGU       25        20       23
             AGA       29        60       62
             AGG       23         0        9

LEU          CUA        8        25        8
(12)         CUC       20        17       58
             CUG       10        17        8
             CUU       28         8        0
             UUA        5        33        8
             UUG       30         0       17

SER          UCA       14        24       18
(21)         UCC       26        10       27
             UCG        3        10        0
             UCU       21        24       18
             AGC       21         0       14
             AGU       15        33       23

THR          ACA       21        47       38
(15)         ACC       41        13       31
             ACG        7        13        0
             ACU       31        27       31

PRO          CCA       45        71       71
(7)          CCC       19         0        0
             CCG        9        14        0
             CCU       26        14       29

GLN          CAA       64        60       67
(5)          CAG       36        40       33

HIS          CAC       65        67      100
(3)          CAU       35        33        0

GLU          GAA       48        86       57
(7)          GAG       52        14       43

ASP          GAC       48        40       50
(5)          GAU       52        60       50

TYR          UAC       68         0       20
(5)          UAU       32       100       80

CYS          UGC       79         0        0
(0)          UGU       22         0        0

PHE          UUC       56         8       67
(13)         UUU       44        92       33

MET          AUG      100       100      100
(2)

TRP          UGG      100       100      100
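
The percentages in a codon usage table of this kind are, for each amino acid, the share of that amino acid's occurrences encoded by each synonymous codon. A minimal sketch of that calculation is given here for illustration; the function name `codon_usage` is hypothetical, the standard genetic code is assumed, and codons that never occur in the input are simply absent from the result rather than reported as 0.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in the conventional U, C, A, G ordering
# ("*" marks the three stop codons).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(cds: str) -> dict:
    """Return {amino acid: {codon: percent}} for an in-frame coding sequence.

    Accepts DNA or RNA letters; codons are reported in RNA notation to
    match the table. Percentages are rounded to whole numbers, so a row
    may sum to 99 or 101.
    """
    rna = cds.upper().replace("T", "U")
    if len(rna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    counts = Counter(rna[i:i + 3] for i in range(0, len(rna), 3))
    per_aa = defaultdict(dict)
    for codon, n in counts.items():
        per_aa[CODON_TABLE[codon]][codon] = n
    usage = {}
    for aa, codons in per_aa.items():
        total = sum(codons.values())
        usage[aa] = {c: round(100 * n / total) for c, n in codons.items()}
    return usage
```

For example, a toy segment with two AGA codons and one CGT codon gives arginine usage of 67% AGA and 33% CGU, the same rounding behavior seen in rows of the table whose entries do not sum to exactly 100.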


Another truncated synthetic HD-73 gene was constructed. The sequence of this synthetic HD-73 gene
is identical to that of the above synthetic HD-73 gene in the region in which they overlap (amino acids 29-
615), and it also encodes Met-Ala at the N-terminus. Figure 8 shows a comparison of this truncated
synthetic HD-73 gene with the N-terminal Met-Ala versus the wild-type HD-73 gene.

While the previous examples have been directed at the preparation of synthetic and modified genes
encoding truncated B.t.k. proteins, synthetic or modified genes can also be prepared which encode full
length toxin proteins.

One full length B.t.k. gene consists of the synthetic HD-73 sequence of Figure 4 from nucleotide 1-1845
plus wild-type HD-73 sequence encoding amino acids 616 to the C-terminus of the native protein. Figure 9
shows a comparison of this synthetic/wild-type full length HD-73 gene versus the wild-type full length HD-73
gene.

Another full length B.t.k. gene consists of the synthetic HD-73 sequence of Figure 4 from nucleotide 1-
1845 plus a modified HD-73 sequence encoding amino acids 616 to the C-terminus of the native protein. The
C-terminal portion has been modified by site-directed mutagenesis to remove putative polyadenylation
signals and ATTTA sequences according to the algorithm of Figure 1. Figure 10 shows a comparison of this
synthetic/modified full length HD-73 gene versus the wild-type full length HD-73 gene.

Another full length B.t.k. gene consists of a fully synthetic HD-73 sequence which incorporates the
synthetic HD-73 sequence of Figure 4 from nucleotide 1-1845 plus a synthetic sequence encoding amino
acids 616 to the C-terminus of the native protein. The C-terminal synthetic portion has been designed to
eliminate putative polyadenylation signals and ATTTA sequences and to include plant preferred codons.
Figure 11 shows a comparison of this fully synthetic full length HD-73 gene versus the wild-type full length
HD-73 gene.

Alternatively, another full length B.t.k. gene consists of a fully synthetic sequence comprising base pairs
1-1830 of B.t.k. HD-1 (Figure 3) and base pairs 1834-3534 of B.t.k. HD-73 (Figure 11).


Example 4 - Expression of Modified and Synthetic B.t.k. HD-1 and Synthetic HD-73

A number of plant transformation vectors for the expression of B.t.k. genes were constructed by
incorporating the structural coding sequences of the previously described genes into plant transformation
cassette vector pMON893. The respective intermediate transformation vector is inserted into a suitable
disarmed Agrobacterium vector such as A. tumefaciens ACO, supra. Tissue explants are cocultured with
the disarmed Agrobacterium vector and plants regenerated under selection for kanamycin resistance using
known protocols: tobacco (Horsch et al., 1985); tomato (McCormick et al., 1986) and cotton (Trolinder et al.,
1987).


a) Tobacco 

The level of B.t.k. HD-1 protein in transgenic tobacco plants containing pMON9921 (wild type
truncated), pMON5370 (modified HD-1, Example 1, Figure 2) and pMON5377 (synthetic HD-1, Example 2,
Figure 3) were analyzed by Western analysis. Leaf tissue was frozen in liquid nitrogen, ground to a fine
powder and then ground in a 1:2 (wt:volume) of SDS-PAGE sample buffer. Samples were frozen on dry ice,
then incubated for 10 minutes in a boiling water bath and microfuged for 10 minutes. The protein
concentration of the supernatant was determined by the method of Bradford (Anal. Biochem. 72:248-254).
Fifty ug of protein was run per lane on 9% SDS-PAGE gels, the protein transferred to nitrocellulose and the
B.t.k. HD-1 protein visualized using antibodies produced against B.t.k. HD-1 protein as the primary antibody
and alkaline phosphatase conjugated second antibody as described by the manufacturer (Promega,
Madison, WI). Purified HD-1 tryptic fragment was used as the control. Whereas the B.t.k. protein from
tobacco plants containing pMON9921 was below the level of detection, the B.t.k. protein from plants
containing the modified (pMON5370) and synthetic (pMON5377) genes was easily detected. The B.t.k.
protein from plants containing pMON9921 remained undetectable, even with 10 fold longer incubation
times. The relative levels of B.t.k. HD-1 protein in these plants is estimated in Table VIII. Because the
protein from plants containing pMON9921 was not observed, the level of protein in these plants was
estimated from the relative mRNA levels (see below). Plants containing the modified gene (pMON5370)
expressed approximately 100 fold more B.t.k. protein than plants containing the wild-type gene
(pMON9921). Plants containing the fully synthetic B.t.k. HD-1 gene (pMON5377) expressed approximately
five fold more protein than plants containing the modified gene. The modified gene contributes the majority
of the increase in B.t.k. expression observed. The plants used to generate the above data are the best
representatives from each construct based either on a tobacco hornworm bioassay or on data derived from
previous Western analysis.


Table VIII

Expression of B.t.k. HD-1 Protein in Transgenic Tobacco

Gene                        B.t.k. Protein*   Fold Increase in
Description   Vector        Concentration     B.t.k. Expression

Wild type     pMON9921            10                  1
Modified      pMON5370          1000                100
Synthetic     pMON5377          5000                500

* B.t.k. protein concentrations are expressed in ng/mg of total soluble protein. The
level of B.t.k. protein for plants containing the wild type gene are estimated from
RNA levels.


Plants containing these genes were tested for bioactivity to determine whether the increased quantities
of protein observed by Western analysis result in a corresponding increase in bioactivity. Leaves from the
same plants used for the Western data in Table VIII were tested for bioactivity against two insects. A
detached leaf bioassay was first done using tobacco hornworm, an extremely sensitive lepidopteran insect.
Leaves from all three transgenic tobacco plants were totally protected and 100% mortality of tobacco
hornworm observed (see Table IX below). A much less sensitive insect, beet armyworm, was then used in
another detached leaf bioassay. Beet armyworm is approximately 500 fold less sensitive to B.t.k. HD-1
protein than tobacco hornworm. The difference in sensitivity of these two insects was determined using
purified HD-1 protein in a diet incorporation assay (see below). Plants containing the wild-type gene
(pMON9921) showed only minimal protection against beet armyworm, whereas plants containing the
modified gene showed almost complete protection and plants containing the fully synthetic gene were
totally protected against beet armyworm damage. The results of these bioassays confirm the levels of B.t.k.
HD-1 expression observed in the Western analysis and demonstrate that the increased levels of B.t.k. HD-
1 protein correlate with increased insecticidal activity.


Table IX

Protection of Tobacco Plants from Tobacco Hornworm and Beet Armyworm

Gene                        Tobacco Hornworm   Beet Armyworm
Description   Vector        Damage*            Damage*

None          None               NL                 NL
Wild type     pMON9921            0                  3
Modified      pMON5370            0                  1
Synthetic     pMON5377            0                  0

* Extent of insect damage was rated: 0, no damage; 1, slight; 2, moderate; 3,
severe; or NL, no leaf left.


The bioactivity of the B.t.k. HD-1 protein produced by these transgenic plants was further investigated
to more accurately quantitate the relative activities. Leaf tissue from tobacco plants containing the wild-type,
modified and synthetic genes were ground in 100 mM sodium carbonate buffer, pH 10 at a 1:2 (wt:vol) ratio.
Particulate material was removed by centrifugation. The supernatant was incorporated into a synthetic diet
similar to that described by Marrone et al. (1985). The diet medium was prepared the day of the test with
the plant extract solutions incorporated in place of the 20% water component. One ml of the diet was
aliquoted into 96 well plates.

After the diet dried, one neonate tobacco budworm larva was added to each well. Sixteen insects were
tested with each plant sample. The plates were incubated at 27°C. After seven days, the larvae from each
treatment were combined and weighed on an analytical balance. The average weight per insect was
calculated and compared to a standard curve relating B.t.k. protein concentrations to average larval weight.
Insect weight was inversely proportional (in a logarithmic manner) to the relative increase in B.t.k. protein
concentration. The amount of B.t.k. HD-1 protein, based on the extent of larval growth inhibition, was
determined for two different plants containing each of the three genes. The specific activity (ng of B.t.k. HD-
1 per mg of plant protein) was determined for each plant. Plants containing the modified HD-1 gene
(pMON5370) averaged approximately 1400 ng (1200 and 1600 ng) of B.t.k. HD-1 per mg of plant extract
protein. This value compares closely with the 1000 ng of B.t.k. HD-1 protein per mg of plant extract protein
as determined by Western analysis (Table VIII). B.t.k. HD-1 concentrations for the plants containing the
synthetic HD-1 gene averaged approximately 8200 ng (7200 and 9200 ng) of B.t.k. HD-1 protein per mg of
plant extract protein. This number compares well to the 5000 ng of HD-1 protein per mg of plant extract
protein estimated by Western analysis. Likewise, plants containing the synthetic gene showed approximately
a six-fold higher specific activity than the corresponding plants containing the modified gene for
these bioassays. In the Western analysis the ratio was approximately five fold; again both are in good
agreement. The level of B.t.k. protein in plants containing the wild-type HD-1 gene (pMON9921) was too low
to give a significant decrease in larval weight and hence was below a level that could be quantitated in this
assay. In conclusion, the levels of B.t.k. HD-1 protein determined by both the bioassays and the Western
analysis for these plants containing the modified and synthetic genes agree, which demonstrates that the
B.t.k. HD-1 protein produced by these plants is biologically active.
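
Reading a protein concentration off a standard curve in which larval weight falls linearly with the logarithm of concentration can be sketched as follows. This is an illustration only: the function names are hypothetical, a simple least-squares line through log10(concentration) is assumed as the form of the curve, and no standard-curve data are given in this example, so any numbers passed in are the user's own.

```python
import math


def fit_log_linear(concs, weights):
    """Least-squares fit of weight = intercept + slope * log10(conc).

    concs and weights are the standard-curve points (same length,
    at least two distinct concentrations, all concentrations > 0).
    """
    xs = [math.log10(c) for c in concs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(weights) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, weights))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def conc_from_weight(weight, intercept, slope):
    """Invert the fitted curve: concentration giving this average weight."""
    return 10 ** ((weight - intercept) / slope)
```

A negative fitted slope corresponds to the inverse relationship described above; an average larval weight measured for a plant extract is then converted to an equivalent protein concentration with `conc_from_weight`, and dividing by the extract protein content would give the specific activity.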

The levels of mRNA were determined in the plants containing the wild-type B.t.k. HD-1 gene
(pMON9921) and the modified gene (pMON5370) to establish whether the increased levels of protein
production result from increased transcription or translation. mRNA from plants containing the synthetic
gene could not be analyzed directly with the same DNA probe as used for the wild-type and modified
genes because of the numerous changes made in the coding sequence. mRNA was isolated and hybridized
with a single-stranded DNA probe homologous to approximately the 5' 90 bp of the wild-type or modified
gene coding sequences. The hybrids were digested with S1 nuclease and the protected probe fragments
analyzed by gel electrophoresis. Because the procedure used a large excess of probe and long hybridization
time, the amount of protected probe is proportional to the amount of B.t.k. mRNA present in the
sample. Two plants expressing the modified gene (pMON5370) were found to produce up to ten-fold more
RNA than a plant expressing the wild-type gene (pMON9921).

The increased mRNA level from the modified gene is consistent with the result expected from the
modifications introduced into this gene. However, this 10 fold increase in mRNA with the modified gene
compared to the wild-type gene is in contrast to the 100 fold increase in B.t.k. protein from these genes in
tobacco plants. If the two mRNAs were equally well translated then a 10 fold increase in stable mRNA
would be expected to yield a 10 fold increase in protein. The higher increase in protein indicates that the
modified gene mRNA is translated at about a 10 fold higher efficiency than wild-type. Thus, about half of
the total effect on gene expression can be explained by changes in mRNA levels and about half by changes
in translational efficiency. This increase in translational efficiency is striking in that only a small fraction of the
codons have been changed in the modified gene; that is, this effect is clearly not due to wholesale codon
usage changes. The increased translational efficiency could be due to changes in mRNA secondary
structure that affect translation or to the removal of specific translational blockades due to specific codons
that were changed.
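
The "about half" attribution can be made explicit by treating the protein fold increase as the product of an mRNA fold increase and a translational efficiency fold increase, so that the two contributions add on a logarithmic scale. The sketch below is one way to write that arithmetic down; the function name is hypothetical and the multiplicative model is an assumption of the sketch.

```python
import math


def split_fold_increase(protein_fold: float, mrna_fold: float):
    """Split a protein fold increase into mRNA and translational parts.

    Assumes protein_fold = mrna_fold * translation_fold, with both
    inputs greater than 1. Returns the implied translation_fold and
    the share of the total (log-scale) effect explained by mRNA.
    """
    if protein_fold <= 1 or mrna_fold <= 1:
        raise ValueError("fold increases must be greater than 1")
    translation_fold = protein_fold / mrna_fold
    mrna_share = math.log(mrna_fold) / math.log(protein_fold)
    return translation_fold, mrna_share
```

With the tobacco figures quoted above (100 fold more protein, 10 fold more mRNA), the implied translational component is 10 fold and the mRNA share of the log-scale effect is one half.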

The increased expression seen with the synthetic HD-1 gene was also seen with a synthetic HD-73
gene in tobacco. B.t.k. HD-73 was undetected in extracts of tobacco plants containing the wild-type
truncated HD-73 gene (pMON5367), whereas B.t.k. HD-73 protein was easily detected in extracts from
tobacco plants containing the synthetic HD-73 gene of Figure 4 (pMON5383). Approximately 1000 ng of
B.t.k. HD-73 protein was detected per mg of total soluble plant protein.

As described in Example 3 above, the B.t.k. HD-73 protein encoded in pMON5383 contains a small C-
terminal extension of amino acids not encoded in the wild-type HD-73 protein. These extra amino acids had
no effect on insect toxicity or on increased plant expression. A second synthetic HD-73 gene was
constructed as described in Example 3 (Figure 8) and used to transform tobacco (pMON5390). Analysis of
plants containing pMON5390 showed that this gene was expressed at levels comparable to that of
pMON5383 and that these plants had similar insecticidal efficacy.

In tobacco plants the synthetic HD-1 gene was expressed at approximately a 5-fold higher level than
the synthetic HD-73 gene. However, this synthetic HD-73 gene still was expressed at least 100-fold better
than the wild-type HD-73 gene. The HD-73 protein is approximately 5-fold more toxic to many insect pests
than the HD-1 protein, so both synthetic HD-1 and HD-73 genes provide approximately comparable
insecticidal efficacy in tobacco.

The full length B.t.k. HD-73 genes described in Example 3 were also incorporated into the plant
transformation vector pMON893 so that they were expressed from the En 35S promoter. The synthetic/wild-
type full length HD-73 gene of Figure 9 was incorporated into pMON893 to create pMON10506. The
synthetic/modified full length HD-73 gene of Figure 10 was incorporated into pMON893 to create
pMON10526. The fully synthetic HD-73 gene of Figure 11 was incorporated into pMON893 to create
pMON10518. These vectors were used to obtain transformed tobacco plants, and the plants were analyzed
for insecticidal efficacy and for B.t.k. HD-73 protein levels by Western blot or ELISA immunoassay.

Tobacco plants containing all three of these full length B.t.k. genes produced detectable B.t.k. protein
and showed 100% mortality of tobacco hornworm. This result is surprising in light of previous reported
attempts to express the full length B.t.k. genes in transgenic plants. Vaeck et al. (1987) reported that a full
length B.t.k. berliner gene similar to our HD-1 gene could not be detectably expressed in tobacco. Barton
et al. (1987) reported a similar result for another full length gene from B.t.k. HD-1 (the so called 4.5 kb
gene), and further indicated that tobacco callus containing this gene became necrotic, indicating that the full
length gene product was toxic to plant cells. Fischhoff et al. (1987) reported that the full length B.t.k. HD-1
gene in tomato was poorly expressed compared to a truncated gene, and no plants that were fully toxic to
tobacco hornworm could be recovered. All three of the above reports indicated much higher expression
levels and recovery of toxic plants if the respective B.t.k. genes were truncated. Adang et al. reported that
the full length HD-73 gene yielded a few tobacco plants with some biological activity (none were highly
toxic) against hornworm and barely detectable B.t.k. protein. It was also noted by them that the major B.t.k.
mRNA in these plants was a truncated 1.7 kb species that would not encode a functional toxin. This
indicated improper expression of the gene in tobacco. In contrast to all of these reports, the three full length
B.t.k. HD-73 genes described above all lead to relatively high levels of protein and high levels of insect
toxicity.

B.t.k. protein and mRNA levels in tobacco plants are shown in Table X for these three vectors. As can
be seen from the table, the synthetic/wild-type gene (pMON10506) produces B.t.k. protein as about 0.01%
of total soluble protein; the synthetic/modified gene produces B.t.k. as about 0.02% of total soluble protein;
and the fully synthetic gene produces B.t.k. as about 0.2% of total soluble protein. B.t.k. mRNA was
analyzed in these plants by Northern blot analysis using the common 5' synthetic half of the genes as a
probe. As shown in Table X, the increased protein levels can largely be attributed to increased mRNA
levels. Compared to the truncated modified and synthetic genes, this could indicate that the major
contributors to increased translational efficiency are in the 5' half of the gene while the 3' half of the gene
contains mostly determinants of mRNA stability. The increased protein levels also indicate that increasing
the amount of the full length gene that is synthetic or modified increases B.t.k. protein levels. Compared to
the truncated synthetic B.t.k. HD-73 genes (pMON5383 or pMON5390), the fully synthetic gene
(pMON10518) produces as much or slightly more B.t.k. protein, demonstrating that the full length genes are
capable of being expressed at high levels in plants. These tobacco plants with high levels of full length HD-
73 protein show no evidence of abnormality and are fully fertile. The B.t.k. protein levels in these plants also
produce the expected levels of insect toxicity based on feeding studies with beet armyworm or diet
incorporation assays of plant extracts with tobacco budworm. The B.t.k. protein detected by Western blot
analysis in these tobacco plants often contains a varying amount of protein of about 80 kDa which is
apparently a proteolytic fragment of the full length protein. The C-terminal half of the full length protein is
known to be proteolytically sensitive, and similar proteolytic fragments are seen from the full length gene in
E. coli and B.t. itself. These fragments are fully insecticidal. The Northern analysis indicated that essentially
all of the mRNA from these full length genes was of the expected full length size. There is no evidence of
truncated mRNAs that could give rise to the 80 kDa protein fragment. In addition, it is possible that the
fragment is not present in intact plant cells and is merely due to proteolysis during extraction for
immunoassay.
Table X

Full Length B.t.k. HD-73 Protein and mRNA Levels in Transgenic
Tobacco Plants

                                    B.t.k. protein   Relative B.t.k.
Gene description      Vector        concentration    mRNA level

Synthetic/wild type   pMON10506         >100              0.5
Synthetic/modified    pMON10526          400              1
Fully synthetic       pMON10518        >2000             40


Thus, there is no serious impediment to producing high levels of B.t.k. HD-73 protein in plants from
synthetic genes, and this is expected to be true of other full length lepidopteran active genes such as B.t.k.
HD-1 or B.t. entomocidus. The fully synthetic B.t.k. HD-1 gene of Example 3 has been assembled in plant
transformation vectors such as pMON893.

The fully synthetic gene in pMON10518 was also utilized in another plant vector and analyzed in
tobacco plants. Although the CaMV35S promoter is generally a high level constitutive promoter in most
plant tissues, the expression level of genes driven by the CaMV35S promoter is low in floral tissue relative to
the levels seen in leaf tissue. Because the economically important targets damaged by some insects are
the floral parts or derived from floral parts (e.g., cotton squares and bolls, tobacco buds, tomato buds and
fruit), it may be advantageous to increase the expression of B.t. protein in these tissues over that obtained
with the CaMV35S promoter.

The 35S promoter of Figwort Mosaic Virus (FMV) is analogous to the CaMV35S promoter. This
promoter has been isolated and engineered into a plant transformation vector analogous to pMON893.
Relative to the CaMV promoter, the FMV 35S promoter is highly expressed in the floral tissue, while still
providing similar high levels of gene expression in other tissues such as leaf. A plant transformation vector,
pMON10517, was constructed in which the full length synthetic B.t.k. HD-73 gene of Figure 11 was driven
by the FMV 35S promoter. This vector is identical to pMON10518 of Example 3 except that the FMV
promoter is substituted for the CaMV promoter. Tobacco plants transformed with pMON10517 and
pMON10518 were obtained and compared for expression of the B.t.k. protein by Western blot or ELISA
immunoassay in leaf and floral tissue. This analysis showed that pMON10517 containing the FMV promoter
expressed the full length HD-73 protein at higher levels in floral tissue than pMON10518 containing the
CaMV promoter. Expression of the full length B.t.k. HD-73 protein from pMON10517 in leaf tissue is
comparable to that seen with the most highly expressing plants containing pMON10518. However, when
floral tissue was analyzed, tobacco plants containing pMON10518 that had high levels of B.t.k. protein in
leaf tissue did not have detectable B.t.k. protein in the flowers. On the other hand, flowers of tobacco plants
containing pMON10517 had levels of B.t.k. protein nearly as high as the levels in leaves at approximately
0.05% of total soluble protein. This analysis showed that the FMV promoter could be used to produce
relatively high levels of B.t.k. protein in floral tissue compared to the CaMV promoter.

b) Tomato.

The wild-type, modified and synthetic B.t.k. HD-1 genes tested in tobacco were introduced into other
plants to demonstrate the broad utility of this invention. Transgenic tomatoes were produced which contain
these three genes. Data show that the increased expression observed with the modified and synthetic gene
in tobacco also extends to tomato. Whereas the B.t.k. HD-1 protein is only barely detectable in plants
containing the wild type HD-1 gene (pMON9921), B.t.k. HD-1 was readily detected and the levels
determined for plants containing the modified (pMON5370) or synthetic (pMON5377) genes. Expression
levels for the plants containing the wild-type, modified and synthetic HD-1 genes were approximately 10,
100 and 500 ng per mg of total plant extract (see Table XI below). The increase in B.t.k. HD-1 protein for the
modified gene accounted for the majority of increase observed, 10 fold higher than the plants containing the
wild-type gene, compared to only an additional five-fold increase for plants containing the synthetic gene.
Again the site-directed changes made in the modified gene are the major contributors to the increased
expression of B.t.k. HD-1.

Table XI

B.t.k. HD-1 Expression in Transgenic Tomato Plants

Gene                        B.t.k. Protein*   Fold Increase in
Description   Vector        Concentration     B.t.k. Expression

Wild type     pMON9921            10                  1
Modified      pMON5370           100                 10
Synthetic     pMON5377           500                 50

* B.t.k. HD-1 protein concentrations are expressed in ng/mg of total soluble plant
protein. Data for plants containing the wild-type gene are estimates from mRNA
levels and protein levels determined by ELISA.


These differences in B.t.k. HD-1 expression were confirmed with bioassays against tobacco hornworm
and beet armyworm. Leaves from tomato plants containing each of these genes controlled tobacco
hornworm damage and produced 100% mortality. With beet armyworm, leaves from plants containing the
wild-type HD-1 gene (pMON9921) showed significant damage, leaves from plants containing the modified
gene (pMON5370) showed less damage and leaves from plants containing the synthetic gene (pMON5377)
were completely protected (see Table XII below).

Table XII

Protection of Tomato Plants from Tobacco Hornworm and Beet Armyworm

Gene                        Tobacco Hornworm   Beet Armyworm
Description   Vector        Damage*            Damage*

None          None               NL                 NL
Wild type     pMON9921            0                  3
Modified      pMON5370            0                  1
Synthetic     pMON5377            0                  0

* Damage was rated as shown in Table IX.


The generality of the synthetic gene approach was extended in tomato with a synthetic B.t.k. HD-73
gene.

In tomato, extracts from plants containing the wild-type truncated HD-73 gene (pMON5367) showed no
detectable HD-73 protein. Extracts from plants containing the synthetic HD-73 gene (pMON5383) showed
high levels of B.t.k. HD-73 protein, approximately 2000 ng per mg of plant extract protein. These data
clearly demonstrate that the changes made in the synthetic HD-73 gene lead to dramatic increases in the
expression of the HD-73 protein in tomato as well as in tobacco.

In contrast to tobacco, the synthetic HD-73 gene in tomato is expressed at approximately 4-fold to 5-
fold higher levels than the synthetic HD-1 gene. Because the HD-73 protein is about 5-fold more active than
the HD-1 protein against many insect pests including Heliothis species, the increased expression of
synthetic HD-73 compared to synthetic HD-1 corresponds to about a 25-fold increased insecticidal efficacy
in tomato.

In order to determine the mechanisms involved in the increased expression of modified and synthetic
B.t.k. HD-1 genes in tomato, S1 nuclease analysis of mRNA levels from transformed tomato plants was
performed. As indicated above, a similar analysis had been performed with tobacco plants, and this analysis
showed that the modified gene produced up to 10-fold more mRNA than the wild-type gene. The analysis in
tomato utilized a different DNA probe that allowed the analysis of wild-type (pMON9921), modified
(pMON5370) and synthetic (pMON5377) HD-1 genes with the same probe. This probe was derived from the
5' untranslated region of the CaMV35S promoter in pMON893 that was common to all three of these
vectors (pMON9921, pMON5370 and pMON5377). This S1 analysis indicated that B.t.k. mRNA levels from
the modified gene were 3 to 5 fold higher than for the wild-type gene, and that mRNA levels for the
synthetic gene were about 2 to 3 fold higher than for the modified gene. Three independent transformants
were analyzed for each gene. Compared to the fold increases in B.t.k. HD-1 protein from these genes in
tomato shown in Table XI, these mRNA increases can explain about half of the total protein increase as was
seen in tobacco for the wild-type and modified genes. For tomato the total mRNA increase from wild-type to
synthetic is about 6 to 15 fold compared to a protein increase of about 50 fold. This result is similar to that
seen for tobacco in comparing the wild-type and modified genes, and it extends to the synthetic gene as
well. That is, about half of the total fold increase in B.t.k. protein from wild-type to modified genes can be
explained by mRNA increases and about half by enhanced translational efficiency. The same is also true in
comparing the modified gene to the synthetic gene. Although there is an additional increase in RNA levels,
this mRNA increase can explain only about half of the total protein increase.
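
The 6 to 15 fold figure is the product of the two stepwise mRNA ranges (3 to 5 fold, then 2 to 3 fold). A short sketch of that range arithmetic, and of the translational component left over after the mRNA increase is accounted for, is given here for illustration; the function names are hypothetical, and treating the quoted ranges as simple lower and upper bounds is an assumption of the sketch.

```python
def combine_fold_ranges(first, second):
    """Multiply two (low, high) fold-increase ranges applied in sequence."""
    return (first[0] * second[0], first[1] * second[1])


def residual_fold_range(protein_fold, mrna_range):
    """Fold increase not explained by mRNA, as a (low, high) range.

    The largest mRNA increase leaves the smallest residual, so the
    bounds swap when dividing.
    """
    low, high = mrna_range
    return (protein_fold / high, protein_fold / low)


# Figures quoted in the text for tomato.
wild_type_to_modified = (3, 5)
modified_to_synthetic = (2, 3)
total_mrna = combine_fold_ranges(wild_type_to_modified, modified_to_synthetic)
residual = residual_fold_range(50, total_mrna)
```

Here `total_mrna` evaluates to (6, 15), and `residual` spans roughly 3.3 to 8.3 fold, which is the portion of the 50 fold protein increase that the text attributes to enhanced translational efficiency.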

The full length B.t.k. genes described above were also used to transform tomato plants and these plants
were analyzed for B.t.k. protein and insecticidal efficacy. The results of this analysis are shown in Table XIII.
Plants containing the synthetic/wild-type gene (pMON10506) produce the B.t.k. HD-73 protein at levels of
about 0.01% of their total soluble protein. Plants containing the synthetic/modified gene (pMON10526)
produce about 0.04% B.t.k. protein, and plants containing the fully synthetic gene (pMON10518) produce
about 0.2% B.t.k. protein. These results are very similar to the tobacco plant results for the same genes.
mRNA levels estimated by Northern blot analysis in tomato also increase in parallel with the protein level
increase. As for tobacco with these three genes, most of the protein increase can be attributed to increased
mRNA with a small component of translational efficiency increase indicated for the fully synthetic gene. The
highest levels of full length B.t.k. protein (from pMON10518) are comparable to or just slightly lower than
the highest levels observed for the truncated HD-73 genes (pMON5383 and pMON5390). Tomato plants
expressing these full length genes have the insecticidal activity expected for the observed protein levels as
determined by feeding assays with beet armyworm or by diet incorporation of plant extracts with tobacco
hornworm.

Table XIII

Full Length B.t.k. HD-73 Protein and mRNA Levels in Transgenic
Tomato Plants

                                    B.t.k. protein   Relative B.t.k.
Gene description      Vector        concentration    mRNA level

Synthetic/wild type   pMON10506          100              1
Synthetic/modified    pMON10526          400              2-4
Fully synthetic       pMON10518         2000             10



c) Cotton.

The generality of the increased expression of B.t.k. HD-1 and B.t.k. HD-73 by use of the modified and
synthetic genes was extended to cotton. Transgenic calli were produced which contain the wild type
(pMON9921) and the synthetic HD-1 (pMON5377) genes. Here again the B.t.k. HD-1 protein produced from
calli containing the wild-type gene was not detected, whereas calli containing the synthetic HD-1 gene
expressed the HD-1 protein at easily detectable levels. The HD-1 protein was produced at approximately
1000 ng/mg of plant calli extract protein. Again, to ensure that the protein produced by the transgenic cotton
calli was biologically active and that the increased expression observed with the synthetic gene translated to
increased biological activity, extracts of cotton calli were made in similar manner as described for tobacco
plants, except that the calli were first dried between Whatman filter paper to remove as much of the water as
possible. The dried calli were then ground in liquid nitrogen and ground in 100 mM sodium carbonate
buffer, pH 10. Approximately 0.5 ml aliquots of this material were applied to tomato leaves with a paint
brush. After the leaf dried, five tobacco hornworm larvae were applied to each of two leaf samples. Leaves
painted with extract from control calli were completely destroyed. Leaves painted with extract from calli
containing the wild-type HD-1 gene (pMON9921) showed severe damage. Leaves painted with extract from
calli containing the synthetic HD-1 gene (pMON5377) showed no damage (see Table XIV below).


Table XIV

Gene               Vector      Damage rating

Control            Control     NL
Wild type HD-1     pMON9921    3
Synthetic HD-1     pMON5377    0
Synthetic HD-73    pMON5383    0


Cotton calli were also produced containing another synthetic gene, a gene encoding B.t.k. HD-73. The
preparation of this gene is described in Example 3. Calli containing the synthetic HD-73 gene produced the
corresponding HD-73 protein at even higher levels than the calli which contained the synthetic HD-1 gene.
Extracts made from calli containing the HD-73 synthetic gene (pMON5383) showed complete control of
tobacco hornworm when painted onto tomato leaves as described above for extracts containing the HD-1
protein (see Table XIV).

Transgenic cotton plants containing the synthetic B.t.k. HD-1 gene (pMON5377) or the synthetic B.t.k.
HD-73 gene (pMON5383) have also been examined. These plants produce the HD-1 or HD-73 proteins at
levels comparable to that seen in cotton callus with the same genes and comparable to tomato and tobacco
plants with these genes. For either synthetic truncated HD-1 or HD-73 genes, cotton plants expressing B.t.k.
protein at 1000 to 2000 ng/mg total protein (0.1% to 0.2%) were recovered at a high frequency. Insect
feeding assays were performed with leaves from cotton plants expressing the synthetic HD-1 or HD-73
genes. These leaves showed no damage (rating of 0) when challenged with larvae of cabbage looper
(Trichoplusia ni), and only slight damage when challenged with larvae of beet armyworm (Spodoptera
exigua). Damage ratings are as defined in Table VIII above. This demonstrated that cotton plants as well as
calli expressed the synthetic HD-1 or HD-73 genes at high levels and that those plants were protected from
damage by Lepidopteran insect larvae.
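The expression levels in this example are quoted both as ng of B.t.k. protein per mg of total protein and as a percentage of total protein. The following is a minimal sketch of that unit conversion; the function name is illustrative and does not come from the specification.

```python
def ng_per_mg_to_percent(ng_per_mg: float) -> float:
    """Convert ng of target protein per mg of total protein to a percentage.

    1 mg equals 1,000,000 ng, so the mass fraction is ng_per_mg / 1e6.
    """
    return ng_per_mg * 100 / 1_000_000


# The range reported for the cotton plants in the text.
for level in (1000, 2000):
    print(f"{level} ng/mg = {ng_per_mg_to_percent(level):.1f}% of total protein")
```

Applied to the reported range of 1000 to 2000 ng/mg, this reproduces the 0.1% to 0.2% figure given in the text.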

Transgenic cotton plants containing either the synthetic truncated HD-1 gene (pMON5377) or the
synthetic truncated HD-73 gene (pMON5383) were also assessed for protection against cotton bollworm at
the whole plant level in the greenhouse. This is a more realistic test of the ability of these plants to produce
an agriculturally acceptable level of control. The cotton bollworm (Heliothis zea) is a major pest of cotton
that produces economic damage by destroying terminals, squares and bolls, and protection of these fruiting
bodies as well as the leaf tissue will be important for effective insect control and adequate crop protection.
To test the protection afforded to whole plants, R1 progeny of cotton plants expressing high levels of either
B.t.k. HD-1 (pMON5377) or B.t.k. HD-73 (pMON5383) were assayed by applying 10-15 eggs of cotton
bollworm per boll or square to the 20 uppermost squares or bolls on each plant. At least 12 plants were
analyzed per treatment. The hatch rate of the eggs was approximately 70%. This corresponds to very high
insect pressure compared to numbers of larvae per plant seen under typical field conditions. Under these
conditions 100% of the bolls on control cotton plants were destroyed by insect damage. For the
transgenics, significant boll protection was observed. Plants containing pMON5377 (HD-1) had 70-75% of
the bolls survive the intense pressure of this assay. Plants containing pMON5383 (HD-73) had 80% to 90%
boll protection. This is likely to be a consequence of the higher activity of HD-73 protein against cotton
bollworm compared to HD-1 protein. In cases where the transgenic plants were damaged by the insects,
the surviving larvae were delayed in their development by at least one instar.

Therefore, the increased expression obtained with the modified and synthetic genes is not limited to
any one crop; tobacco, tomato and cotton calli and cotton plants all showed drastic increases in B.t.k.
expression when the plants/calli were produced containing the modified or synthetic genes. Likewise, the
utility of changes made to produce the modified and synthetic B.t.k. HD-1 gene is not limited to the HD-1
gene. The synthetic HD-73 gene in all three species also showed drastic increases in expression.

In summary, it has been demonstrated that: (1) the genetic changes made in the HD-1 modified gene
lead to very significant increases in B.t.k. HD-1 expression; (2) production of a totally synthetic gene led to
a further five-fold increase in B.t.k. HD-1 expression; (3) the changes incorporated into the modified HD-1
gene accounted for the majority of the increased B.t.k. expression observed with the synthetic gene; (4) the
increased expression was demonstrated in three different plants -- tobacco plants, tomato plants and cotton
calli and cotton plants; (5) the increased expression as observed by Western analysis also correlated with
similar increases in bioactivity, showing that the B.t.k. HD-1 proteins produced were comparably active; (6)
when the method of the present invention used to design the synthetic HD-1 gene was employed to design
a synthetic HD-73 gene it also was expressed at much higher levels in tobacco, tomato and cotton than the
wild-type equivalent gene with consequent increases in bioactivity; (7) a fully synthetic full length B.t.k. gene
was expressed at levels comparable to synthetic truncated genes.

Example 5 - Synthetic B.t. tenebrionis Gene in Tobacco, Tomato and Potato


Referring to Figure 12, a synthetic gene encoding a Coleopteran active toxin is prepared by making the
indicated changes in the wild-type gene of B.t. tenebrionis or de novo synthesis of the synthetic structural
gene. The synthetic gene is inserted into an intermediate plant transformation vector such as pMON893.
Plasmid pMON893 containing the synthetic B.t.t. gene is then inserted into a suitable disarmed
Agrobacterium strain such as A. tumefaciens ACO.

Transformation and Regeneration of Potato


Sterile shoot cultures of Russet Burbank are maintained in vials containing 10 ml of PM medium
(Murashige and Skoog (MS) inorganic salts, 30 g/l sucrose, 0.17 g/l NaH2PO4·H2O, 0.4 mg/l thiamine-HCl,
and 100 mg/l myo-inositol, solidified with 1 g/l Gelrite at pH 6.0). When shoots reached approximately 5 cm
in length, stem internode segments of 7-10 mm are excised and smeared at the cut ends with a disarmed
Agrobacterium tumefaciens vector containing the synthetic B.t.t. gene from a four day old plate culture.
The stem explants are co-cultured for three days at 23°C on a sterile filter paper placed over 1.5 ml of a
tobacco cell feeder layer overlaid on 1/10 P medium (1/10 strength MS inorganic salts and organic addenda
without casein as in Jarret et al. (1980), 30 g/l sucrose and 8.0 g/l agar). Following co-culture the explants
are transferred to full strength P-1 medium for callus induction, composed of MS inorganic salts, organic
additions as in Jarret et al. (1980) with the exception of casein, 3.0 mg/l benzyladenine (BA), and 0.01 mg/l
naphthaleneacetic acid (NAA) (Jarret, et al., 1980). Carbenicillin (500 mg/l) is included to inhibit bacterial
growth, and 100 mg/l kanamycin is added to select for transformed cells. After four weeks the explants are
transferred to medium of the same composition but with 0.3 mg/l gibberellic acid (GA3) replacing the BA
and NAA (Jarret et al., 1981) to promote shoot formation. Shoots begin to develop approximately two weeks
after transfer to shoot induction medium; these are excised and transferred to vials of PM medium for
rooting. Shoots are tested for kanamycin resistance conferred by the enzyme neomycin phosphotransferase
II, by placing a section of the stem onto callus induction medium containing MS organic and inorganic salts,
30 g/l sucrose, 2.25 mg/l BA, 0.186 mg/l NAA, 10 mg/l GA3 (Webb, et al., 1983) and 200 mg/l kanamycin
to select for transformed cells.

The synthetic B.t.t. gene described in Figure 12 was placed into a plant expression vector as described
in example 5. The plasmid has the following characteristics: a synthetic BglII fragment having approximately
1800 base pairs was inserted into pMON893 in such a manner that the enhanced 35S promoter would
express the B.t.t. gene. This construct, pMON1982, was used to transform both tobacco and tomato.
Tobacco plants, selected as 25 kanamycin resistant plants, were screened with rabbit anti-B.t.t. antibody.
Cross-reactive material was detected at levels predicted to be suitable to cause mortality to CPB. These
target insects will not feed on tobacco, but the transgenic tobacco plants do demonstrate that the synthetic
gene does improve expression of this protein to detectable levels.

Tomato plants with the pMON1982 construct were determined to produce B.t.t. protein at levels
insecticidal to CPB. In initial studies, the leaves of four plants (5190, 5225, 5328 and 5133) showed little or
no damage when exposed to CPB larvae (damage rating of 0-1 on a scale of 0 to 4 with 4 as no leaf
remaining). Under these conditions the control leaves were completely eaten. Immunological analysis of
these plants confirmed the presence of material cross-reactive with anti-B.t.t. antibody. Levels of protein
expression in these plants were estimated at approximately 1 to 5 ng of B.t.t. protein in 50 ug of total
extractable protein. A total of 17 tomato plants (17 of 65 tested) have been identified which demonstrate
protection of leaf tissue from CPB (rating of 0 or 1) and show good insect mortality.

Results similar to those seen in tobacco and tomato with pMON1982 were seen with pMON1984 in the
same plant species. pMON1984 is identical to pMON1982 except that the synthetic protease inhibitor
(CMTI) is fused upstream of the native proteolytic cleavage site. Levels of expression in tobacco were
estimated to be similar to pMON1982, between 10-15 ng per 50 ug of total soluble protein.
Tomato plants expressing pMON1984 have been identified which protect the leaves from ingestion by
CPB. The damage rating was 0 with 100% insect mortality.

Potato was transformed as described in example 5 with a vector similar to pMON1982 containing the
enhanced CaMV35S/synthetic B.t.t. gene. Leaves of potato plants transformed with this vector were
screened by CPB insect bioassay. Of the 35 plants tested, leaves from 4 plants, 16a, 13c, 13d, and 23a,
were totally protected when challenged. Insect bioassays with leaves from three other plants, 13e, 1a, and
13b, recorded damage levels of 1 on a scale of 0 to 4 with 4 being total devastation of the leaf material.
Immunological analysis confirmed the presence of B.t.t. cross-reactive material in the leaf tissue. The level
of B.t.t. protein in leaf tissue of plant 16a (damage rating of 0) was estimated at 20-50 ng of B.t.t. protein/50
ug of total soluble protein. The level of B.t.t. protein seen in 16a tissue was consistent with its biological
activity. Immunological analysis of 13e and 13b (tissue which scored 1 in damage rating) revealed less protein
(5-10 ng/50 ug of total soluble protein) than in plant 16a. Cuttings of plant 16a were challenged with 50 to
200 eggs of CPB in a whole plant assay. Under these conditions 16a showed no damage and 100%
mortality of insects while control potato plants were heavily damaged.

Example 6 - Synthetic B.t.k. P2 Protein Gene

The P2 protein is a distinct insecticidal protein produced by some strains of B.t. including B.t.k. HD-1. It
is characterized by its activity against both lepidopteran and dipteran insects (Yamamoto and Iizuka).
Genes encoding the P2 protein have been isolated and characterized (Donovan et al., 1988). The P2
proteins encoded by these genes are approximately 600 amino acids in length. These proteins share only
limited homology with the lepidopteran specific P1 type proteins, such as the B.t.k. HD-1 and HD-73
proteins described in previous examples.

The P2 proteins have substantial activity against a variety of lepidopteran larvae including cabbage
looper, tobacco hornworm and tobacco budworm. Because they are active against agronomically important
insect pests, the P2 proteins are a desirable candidate in the production of insect tolerant transgenic plants
either alone or in combination with the other B.t. toxins described in the above examples. In some plants,
expression of the P2 protein alone might be sufficient to provide protection against damaging insects. In
addition, the P2 proteins might provide protection against agronomically important dipteran pests. In other
cases, expression of P2 together with the B.t.k. HD-1 or HD-73 protein might be preferred. The P2 proteins
should provide at least an additive level of insecticidal activity when combined with the crystal protein toxin
of B.t.k. HD-1 or HD-73, and the combination may even provide a synergistic activity. Although the mode of
action of the P2 protein is unknown, its distinct amino acid sequence suggests that it functions differently
from the B.t.k. HD-1 and HD-73 type of proteins. Production of two insect tolerance proteins with different
modes of action in the same plant would minimize the potential for development of insect resistance to B.t.
proteins in plants. The lack of substantial DNA homology between P2 genes and the HD-1 and HD-73
genes minimizes the potential for recombination between multiple insect tolerance genes in the plant
chromosome.

The genes encoding the P2 protein, although distinct in sequence from the B.t.k. HD-1 and HD-73
genes, share many common features with these genes. In particular, the P2 protein genes have a high A+T
content (65%), multiple potential polyadenylation signal sequences (26) and numerous ATTTA sequences
(10). Because of its overall similarity to the poorly expressed wild-type B.t.k. HD-1 and HD-73 genes, the
same problems are expected in expression of the wild-type P2 gene as were encountered with the previous
examples. Based on the above-described method for designing the synthetic B.t. genes, a synthetic P2
gene has been designed which gene should be expressed at adequate levels for protection in plants. A
comparison of the wild-type and synthetic P2 genes is shown in Figure 13.
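The three sequence features cited for the wild-type genes (A+T content, potential polyadenylation signals and ATTTA sequences) can all be tallied directly from a coding sequence. The sketch below is illustrative only: the function names, the toy sequence and the short motif list are assumptions, and the motif list is not the specification's own list of potential plant polyadenylation signals.

```python
import re

# Hypothetical motif list for illustration only; the specification's own list
# of potential plant polyadenylation signals is not reproduced here.
EXAMPLE_POLYA_SIGNALS = ("AATAAA", "AACCAA", "AAGCAT")


def at_content(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, including overlapping ones."""
    # A zero-width lookahead matches at every start position of the motif.
    return len(re.findall(f"(?={re.escape(motif)})", seq.upper()))


def summarize(seq: str, polya_signals=EXAMPLE_POLYA_SIGNALS) -> dict:
    """Tally the three features discussed in the text for one sequence."""
    return {
        "at_fraction": at_content(seq),
        "attta_count": count_overlapping(seq, "ATTTA"),
        "polya_signal_count": sum(count_overlapping(seq, m) for m in polya_signals),
    }


# Toy sequence, not taken from any gene in the specification.
toy = "ATGAATAAATTTATTTACCGG"
print(summarize(toy))
```

Overlapping matches are counted because two ATTTA motifs can share a base, as in ATTTATTTA.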



Example 7 - Synthetic B.t. Entomocidus Gene 

The B.t. entomocidus ("Btent") protein is a distinct insecticidal protein produced by some strains of
B.t. bacteria. It is characterized by its high level of activity against some lepidopterans that are relatively
insensitive to B.t.k. HD-1 and HD-73 such as Spodoptera species including beet armyworm (Visser et al.,
1988). Genes encoding the Btent protein have been isolated and characterized (Honee et al., 1986). The
Btent proteins encoded by these genes are approximately the same length as B.t.k. HD-1 and HD-73.
These proteins share only 68% amino acid homology with the B.t.k. HD-1 and HD-73 proteins. It is likely
that only the N-terminal half of the Btent protein is required for insecticidal activity as is the case for HD-1
and HD-73. Over the first 625 amino acids, Btent shares only 38% amino acid homology with HD-1 and
HD-73.

Because of their higher activity against Spodoptera species that are relatively insensitive to HD-1 and
HD-73, the Btent proteins are a desirable candidate for the production of insect tolerant transgenic plants
either alone or in combination with the other B.t. toxins described in the above examples. In some plants
production of Btent alone might be sufficient to control the agronomically important pests. In other plants,
the production of two distinct insect tolerance proteins would provide protection against a wider array of
insects. Against those insects where both proteins are active, the combination of the B.t.k. HD-1 or HD-73
type protein plus the Btent protein should provide at least additive insecticidal efficacy, and may even
provide a synergistic activity. In addition, because of its distinct amino acid sequence, the Btent protein
may have a different mode of action than HD-1 or HD-73. Production of two insecticidal proteins in the
same plant with different modes of action would minimize the potential for development of insect resistance
to B.t. proteins in plants. The relative lack of DNA sequence homology with the B.t.k. type genes minimizes
the potential for recombination between multiple insect tolerance genes in the plant chromosome.

The genes encoding the Btent protein, although distinct in sequence from the B.t.k. HD-1 and HD-73
genes, share many common features with these genes. In particular, the Btent protein genes have a high
A+T content (62%), multiple potential polyadenylation signal sequences (39 in the full length coding
sequence and 27 in the first 1875 nucleotides that are likely to encode the active toxic fragment) and
numerous ATTTA sequences (16 in the full length coding sequence and 12 in the first 1875 nucleotides).
Because of its overall similarity to the poorly expressed wild type B.t.k. HD-1 and HD-73 genes, the
wild-type Btent genes are expected to exhibit similar problems in expression as were encountered with the
wild-type HD-1 and HD-73 genes. Based on the above-described method used for designing the other synthetic
B.t. genes, a synthetic Btent gene has been designed which gene should be expressed at adequate levels
for protection in plants. A comparison of the wild type and synthetic Btent genes is shown in Figure 14.


Example 8 -- Synthetic B.t.k. Genes for Expression in Corn


High level expression of heterologous genes in corn cells has been shown to be enhanced by the
presence of a corn gene intron (Callis et al., 1987). Typically these introns have been located in the 5'
untranslated region of the chimeric gene. It has been shown that the CaMV35S promoter and the NOS 3'
end function efficiently in the expression of heterologous genes in corn cells (Fromm et al., 1986).

Referring to Figure 15, a plant expression cassette vector (pMON744) was constructed that contains
these sequences. Specifically the expression cassette contains the enhanced CaMV35S promoter followed
by intron I of the corn Adh1 gene (Callis et al., 1987). This is followed by a multilinker cloning site for
insertion of coding sequences; this multilinker contains a BglII site among others. Following the multilinker is
the NOS 3' end. pMON744 also contains the selectable marker gene 35S/NPTII/NOS 3' for kanamycin
selection of transgenic corn cells. In addition, pMON744 has an E. coli origin of replication and an ampicillin
resistance gene for selection of the plasmid in E. coli.

Five B.t.k. coding sequences described in the previous examples were inserted into the BglII site of
pMON744 for corn cell expression of B.t.k. The coding sequences inserted and resulting vectors were:

1. Wild type B.t.k. HD-1 from pMON9921 to make pMON8652.

2. Modified B.t.k. HD-1 from pMON5370 to make pMON8642.

3. Synthetic B.t.k. HD-1 from pMON5377 to make pMON8643.

4. Synthetic B.t.k. HD-73 from pMON5390 to make pMON8644.

5. Synthetic full length B.t.k. HD-73 from pMONlOStS to make pMON10902.

pMON8652 (wild-type B.t.k. HD-1) was used to transform corn cell protoplasts and stably transformed
kanamycin resistant callus was isolated. B.t.k. mRNA in the corn cells was analyzed by nuclease S1
protection and found to be present at a level comparable to that seen with the same wild-type coding
sequence (pMON9921) in transgenic tomato plants.

pMON8652 and pMON8642 (modified HD-1) were used to transform corn cell protoplasts in a transient
expression system. The level of B.t.k. mRNA was analyzed by nuclease S1 protection. The modified HD-1
gave rise to a several fold increase in B.t.k. mRNA compared to the wild-type coding sequence in the
transiently transformed corn cells. This indicated that the modifications introduced into the B.t.k. HD-1 gene
are capable of enhancing B.t.k. expression in monocot cells as was demonstrated for dicot plants and cells.

pMON8642 (modified HD-1) and pMON8643 (synthetic HD-1) were used to transform Black Mexican
Sweet (BMS) corn cell protoplasts by PEG-mediated DNA uptake, and stably transformed corn callus was
selected by growth on kanamycin containing plant growth medium. Individual callus colonies that were
derived from single transformed cells were isolated and propagated separately on kanamycin containing
medium.

To assess the expression of the B.t.k. genes in these cells, callus samples were tested for insect
toxicity by bioassay against tobacco hornworm larvae. For each vector, 96 callus lines were tested by
bioassay. Portions of each callus were placed on sterile water agar plates, and five neonate tobacco
hornworm larvae were added and allowed to feed for 4 days. For pMON8643, 100% of the larvae died after
feeding on 15 of the 96 calli and these calli showed little feeding damage. For pMON8642, only 1 of the 96
calli was toxic to the larvae. This showed that the B.t.k. gene was being expressed in these samples at
insecticidal levels. The observation that significantly more calli containing pMON8643 were toxic than for
pMON8642 showed that significantly higher levels of expression were obtained when the synthetic HD-1
coding sequence was contained in corn cells than when the modified HD-1 coding sequence was used,
similar to the previous examples with dicot plants. A semiquantitative immunoassay showed that the
pMON8643 toxic samples had significantly higher B.t.k. protein levels than the pMON8642 toxic sample.

The 16 callus samples that were toxic to tobacco hornworm were also tested for activity against
European corn borer. European corn borer is approximately 40-fold less sensitive to the HD-1 gene product
than is tobacco hornworm. Larvae of European corn borer were applied to the callus samples and allowed
to feed for 4 days. Two of the 16 calli tested, both of which contained pMON8643 (synthetic HD-1), were
toxic to European corn borer larvae.

To assess the expression of the B.t.k. genes in differentiated corn tissue, another method of DNA
delivery was used. Young leaves were excised from corn plants, and DNA samples were delivered into the
leaf tissue by microprojectile bombardment. In this system, the DNA on the microprojectiles is transiently
expressed in the leaf cells after bombardment. Three DNA samples were used, and each DNA was tested
in triplicate:

1. pMON744, the corn expression vector with no B.t.k. gene.

2. pMON8643 (synthetic HD-1).

3. pMON752, a corn expression vector for the GUS gene, no B.t.k. gene.

The leaves were incubated at room temperature for 24 hours. The pMON752 samples were stained with
a substrate that allows visual detection of the GUS gene product. This analysis showed that over one
hundred spots in each sample were expressing the GUS product and that the triplicate samples showed
very similar levels of GUS expression. For the pMON744 and pMON8643 samples, 5 larvae of tobacco
hornworm were added to each leaf and allowed to feed for 46 hours. All three samples bombarded with
pMON744 showed extensive feeding damage and no larval mortality. All three samples bombarded with
pMON8643 showed no evidence of feeding damage and 100% larval mortality. The samples were also
assayed for the presence of B.t.k. protein by a qualitative immunoassay. All of the pMON8643 samples had
detectable B.t.k. protein. These results demonstrated that the synthetic B.t.k. gene was expressed in
differentiated corn plant tissue at insecticidal levels.


Example 9 -- Synthetic Potato Leaf Roll Virus Coat Protein Gene 


Expression in plants of the coat protein genes from a variety of plant viruses has proven to be an
effective method of engineering resistance to these viruses. In order to achieve virus resistance, it is
important to express the viral coat protein at an effective level. For many plant virus coat protein genes, this
has not proved to be a problem. However, for the coat protein gene from potato leaf roll virus (PLRV),
expression of the coat protein has been observed to be low relative to other coat protein genes, and this
lower level of protein has not led to optimal resistance to PLRV.

The gene for PLRV coat protein is shown in Figure 16. Referring to Figure 16, the upper line of
sequence shows the gene as it was originally engineered for plant expression in vector pMON893. The
gene was contained on a 749 nucleotide BglII-EcoRI fragment with the coding sequence contained between
nucleotides 20 and 643. This fragment also contained 19 nucleotides of 5' noncoding sequence and 104
nucleotides of 3' noncoding sequence. This PLRV coat protein gene was relatively poorly expressed in
plants compared to other viral coat protein genes.

A synthetic gene was designed to improve plant expression of the PLRV coat protein. Referring again
to Figure 16, the changes made in the synthetic PLRV gene are shown in the lower line. This gene was
designed to encode exactly the same protein as the naturally occurring gene. Note that the beginning of the
synthetic gene is at nucleotide 14 and the end of the sequence is at nucleotide 654. The coding sequence
for the synthetic gene is from nucleotide 20 to 643 of the figure. The changes indicated just upstream and
downstream of these endpoints serve only to introduce convenient restriction sites just outside the coding
sequence. Thus the size of the synthetic gene is 641 nucleotides, which is smaller than the naturally
occurring gene. The synthetic gene is smaller because substantially all of the noncoding sequence at both
the 5' and 3' ends, except for segments encoding the BglII and EcoRI restriction sites, has been removed.
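The size quoted for the synthetic gene follows from its endpoints when the figure's nucleotide positions are treated as 1-based and inclusive. The short sketch below, with an illustrative helper name, works through that arithmetic for the synthetic gene and for its coding sequence.

```python
def span_length(first: int, last: int) -> int:
    """Length of a span given 1-based, inclusive start and end positions."""
    return last - first + 1


synthetic_gene = span_length(14, 654)   # synthetic gene endpoints in Figure 16
coding_region = span_length(20, 643)    # coding sequence endpoints in Figure 16

print(synthetic_gene)                    # 641
print(coding_region, coding_region // 3) # 624 nucleotides, 208 triplets
```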

The synthetic gene differs from the naturally occurring gene in two main respects. First, 41 individual
codons within the coding sequence have been changed to remove nearly all codons for a given amino acid
that constitute less than about 15% of the codons for that amino acid in a survey of dicot plant genes.
Second, the 5' and 3' noncoding sequences of the original gene have been removed. Although not strictly
conforming to the algorithm described in Figure 1, a few of the codon changes and especially the removal
of the long 3' noncoding region are consistent with this algorithm.
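The codon criterion described here, flagging any codon that makes up less than about 15% of the codons used for its amino acid, can be sketched as a simple lookup. The usage fractions below are invented for illustration and are not the dicot survey values; the table, function name and toy sequence are all assumptions.

```python
# Hypothetical relative codon usage (fraction of the codons for each amino
# acid). These numbers are made up for illustration; they are NOT the dicot
# plant gene survey values referred to in the text.
EXAMPLE_USAGE = {
    "Leu": {"CTT": 0.26, "CTC": 0.18, "CTA": 0.08,
            "CTG": 0.10, "TTA": 0.12, "TTG": 0.26},
    "Lys": {"AAA": 0.40, "AAG": 0.60},
}


def rare_codons(coding_seq: str, usage: dict, threshold: float = 0.15) -> list:
    """Return (codon_number, codon, fraction) for codons used below threshold.

    Codon numbers are 1-based. Codons absent from the usage table are skipped.
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Flatten the per-amino-acid tables into one codon -> fraction lookup.
    fraction_of = {codon: frac for table in usage.values()
                   for codon, frac in table.items()}
    flagged = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        frac = fraction_of.get(codon)
        if frac is not None and frac < threshold:
            flagged.append((i // 3 + 1, codon, frac))
    return flagged


# Toy coding fragment: CTT CTA AAG TTA CTG
print(rare_codons("CTTCTAAAGTTACTG", EXAMPLE_USAGE))
```

A flagged codon would then be a candidate for replacement by a more frequently used codon for the same amino acid, which leaves the encoded protein unchanged.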

The original PLRV sequence contains two potential plant polyadenylation signals (AACCAA and
AAGCAT) and both of these occur in the 3' noncoding sequence that has been removed in the synthetic
gene. The original PLRV gene also contains one ATTTA sequence. This is also contained in the 3'
noncoding sequence, and is in the midst of the longest stretch of uninterrupted A+T in the gene (a stretch
of 7 A+T nucleotides). This sequence was removed in the synthetic gene. Thus, sequences that the
algorithm of Figure 1 targets for change have been changed in the synthetic PLRV coat protein gene by
removal of the 3' noncoding segment. Within the coding sequence, codon changes were also made to
remove three other regions of sequence described above. In particular, two regions of 5 consecutive A+T
and one region of 5 consecutive G+C within the coding sequence have been removed in the synthetic
gene.
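Stretches of consecutive A+T or G+C nucleotides of the kind discussed here can be located with a regular expression. The following is a minimal sketch; the function names, the default run length of 5 and the toy sequence are assumptions for illustration and are not taken from Figure 16.

```python
import re


def base_runs(seq: str, bases: str, min_len: int = 5) -> list:
    """Return (1-based start, run) for each run of `bases` at least min_len long."""
    pattern = re.compile(f"[{bases}]{{{min_len},}}")
    return [(m.start() + 1, m.group()) for m in pattern.finditer(seq.upper())]


def longest_run(seq: str, bases: str) -> str:
    """Return the longest uninterrupted run made up only of `bases`."""
    runs = re.findall(f"[{bases}]+", seq.upper())
    return max(runs, key=len, default="")


# Toy sequence, not taken from the PLRV gene.
toy = "GCATTTAATGCCGGCGATATC"
print(base_runs(toy, "AT"))    # runs of 5 or more consecutive A+T
print(base_runs(toy, "GC"))    # runs of 5 or more consecutive G+C
print(longest_run(toy, "AT"))  # longest uninterrupted A+T stretch
```

In the toy sequence the longest A+T stretch is 7 nucleotides and contains an ATTTA, mirroring the situation described for the 3' noncoding sequence of the original gene.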

The synthetic PLRV coat protein gene is cloned in a plant transformation vector such as M .\1CI^333 and
is used to transform potato plants as described above. These plants express the PLRV coat protein at higher
levels than achieved with the naturally occurring gene, and these plants exhibit increased resistance to
infection by PLRV.


Example 10 -- Expression of Synthetic B.t. Genes with RUBISCO Small Subunit Promoters and Chloroplast
Transit Peptides


The genes in plants encoding the small subunit of RUBISCO (SSU) are often highly expressed, light
regulated and sometimes show tissue specificity. These expression properties are largely due to the
promoter sequences of these genes. It has been possible to use SSU promoters to express heterologous
genes in transformed plants. Typically a plant will contain multiple SSU genes, and the expression levels
and tissue specificity of different SSU genes will be different. The SSU proteins are encoded in the nucleus
and synthesized in the cytoplasm as precursors that contain an N-terminal extension known as the
chloroplast transit peptide (CTP). The CTP directs the precursor to the chloroplast and promotes the uptake
of the SSU protein into the chloroplast. In this process, the CTP is cleaved from the SSU protein. These
CTP sequences have been used to direct heterologous proteins into chloroplasts of transformed plants.

The SSU promoters might have several advantages for expression of B.t.k. genes in plants. Some SSU
promoters are very highly expressed and could give rise to expression levels as high or higher than those
observed with the CaMV35S promoter. The tissue distribution of expression from SSU promoters is different
from that of the CaMV35S promoter, so for control of some insect pests, it may be advantageous to direct
the expression of B.t.k. to those cells in which SSU is most highly expressed. For example, although
relatively constitutive, in the leaf the CaMV35S promoter is more highly expressed in vascular tissue than in
some other parts of the leaf, while most SSU promoters are most highly expressed in the mesophyll cells of
the leaf. Some SSU promoters also are more highly tissue specific, so it could be possible to utilize a
specific SSU promoter to express B.t.k. in only a subset of plant tissues, if for example B.t. expression in
certain cells was found to be deleterious to those cells. For example, for control of Colorado potato beetle in
potato, it may be advantageous to use SSU promoters to direct B.t.t. expression to the leaves but not to the
edible tubers.

Utilizing SSU CTP sequences to localize B.t. proteins to the chloroplast might also be advantageous.
Localization of the B.t. to the chloroplast could protect the protein from proteases found in the cytoplasm.
This could stabilize the B.t. protein and lead to higher levels of accumulation of active protein. B.t. genes
containing the CTP could be used in combination with the SSU promoter or with other promoters such as
CaMV35S.

A variety of plant transformation vectors were constructed for the expression of B.t.k. genes utilizing
SSU promoters and SSU CTPs. The promoters and CTPs utilized were from the petunia SSU11a gene
described by Turner et al. (1986) and from the Arabidopsis ats1A gene (an SSU gene) described by
Krebbers et al. (1988) and by Elionor et al. (1989). The petunia SSU11a promoter was contained on a DNA
fragment that extended approximately 800 bp upstream of the SSU coding sequence. The Arabidopsis
ats1A promoter was contained on a DNA fragment that extended approximately 1.8 kb upstream of the SSU
coding sequence. At the upstream end convenient sites from the multilinker of pUC18 were used to move
these promoters into plant transformation vectors such as pMON893. These promoter fragments extended
to the start of the SSU coding sequence at which point an NcoI restriction site was engineered to allow
insertion of the B.t. coding sequence, replacing the SSU coding sequence.

When SSU promoters were used in combination with their CTP, the DNA fragments extended through
the coding sequence of the CTP and a small portion of the mature SSU coding sequence at which point an
NcoI restriction site was engineered by standard techniques to allow the in frame fusion of B.t. coding
sequences with the CTP. In particular, for the petunia SSU11a CTP, B.t. coding sequences were fused to
the SSU sequence after amino acid 8 of the mature SSU sequence at which point the NcoI site was placed.
The 8 amino acids of mature SSU sequence were included because preliminary in vitro chloroplast uptake
experiments indicated that uptake of B.t.k. was observed only if this segment of mature SSU was
included. For the Arabidopsis ats1A CTP, the complete CTP was included plus 24 amino acids of mature
SSU sequence plus the sequence gly-gly-arg-val-asn-cys-met-gln-ala-met, terminating in an NcoI site for
B.t. fusion. This short sequence reiterates the native SSU CTP cleavage site (between the cys and met)
plus a short segment surrounding the cleavage site. This sequence was included in order to insure proper
uptake into chloroplasts. B.t. coding sequences were fused to this ats1A CTP after the met codon. In vitro
uptake experiments with this CTP construction and other (non-B.t.) coding sequences showed that this CTP
did target proteins to the chloroplast.

When CTPs were used in combination with the CaMV35S promoter, the same CTP segments were
used. They were excised just upstream of the ATG start sites of the CTP by engineering of BglII sites, and
placed downstream of the CaMV35S promoter in pMON893 as BglII to NcoI fragments. B.t. coding
sequences were fused as described above.

The wild type B.t.k. HD-1 coding sequence of pMON9921 (see Figure 1) was fused to the ats1A
promoter to make pMON1925 or the ats1A promoter plus CTP to make pMON1921. These vectors were
used to transform tobacco plants, and the plants were screened for activity against tobacco hornworm. No
toxic plants were recovered. This is surprising in light of the fact that toxic plants could be recovered, albeit
at a low frequency, after transformation with pMON9921 in which the B.t.k. coding sequence was expressed
from the enhanced CaMV35S promoter in pMON893, and in light of the fact that Elionor et al. (1989) report
that the ats1A promoter itself is comparable in strength to the CaMV35S promoter and approximately 10-fold
stronger when the CTP sequence is included. At least for the wild-type B.t.k. HD-1 coding sequence,
this does not appear to be the case.

A variety of plant transformation vectors were constructed utilizing either the truncated synthetic HD-73
coding sequence of Figure 4 or the full length B.t.k. HD-73 coding sequence of Figure 11. These are listed
in the table below.

Table XV

Gene Constructs with CTPs

Vector       Promoter   CTP       B.t.k. HD-73 Coding Sequence
pMON10806    En 35S     ats1A     truncated
pMON10814    En 35S     SSU11a    full length
pMON10811    SSU11a     SSU11a    truncated
pMON10819    SSU11a     none      truncated
pMON10815    ats1A      none      truncated
pMON10817    ats1A      ats1A     truncated
pMON10821    En 35S     ats1A     truncated
pMON10822    En 35S     ats1A     full length
pMON10838    SSU11a     SSU11a    full length
pMON10839    ats1A      ats1A     full length

All of the above vectors were used to transform tobacco plants. For all of the vectors containing
truncated B.t.k. genes, leaf tissue from these plants has been analyzed for toxicity to insects and for B.t.k.
protein levels by immunoassay. pMON10806, 10811, 10819 and 10821 produce levels of B.t.k. protein
comparable to pMON5383 and pMON5390, which contain synthetic B.t.k. HD-73 coding sequences driven
by the En 35S promoter itself with no CTP. These plants also have the insecticidal activity expected for the
B.t.k. protein levels detected. For pMON10815 and pMON10817 (containing the ats1A promoter), the level
of B.t.k. protein is about 5-fold higher than that found in plants containing pMON5383 or 5390. These plants
also have higher insecticidal activity. Plants containing 10815 and 10817 contain up to 1% of their total
soluble leaf protein as B.t.k. HD-73. This is the highest level of B.t.k. protein yet obtained with any of the
synthetic genes.

This result is surprising in two respects. First, as noted above, the wild type coding sequences fused to
the ats1A promoter and CTP did not show any evidence of higher levels of expression than for En 35S, and
in fact had lower expression based on the absence of any insecticidal plants. Second, Elionor et al. (1989)
show that for two other genes, the ats1A CTP can increase expression from the ats1A promoter by about
10-fold. For the synthetic B.t.k. HD-73 gene, there is no consistent increase seen by including the CTP over
and above that seen for the ats1A promoter alone.

Tobacco plants containing the full length synthetic HD-73 fused to the SSU11a CTP and driven by the
En 35S promoter produced levels of B.t.k. protein and insecticidal activity comparable to pMON10518, which
does not include the CTP. In addition, for pMON10518 the B.t.k. protein extracted from plants was
observed by gel electrophoresis to contain multiple forms less than full length, apparently due to the cleavage
of the C-terminal portion (not required for toxicity) in the cytoplasm. For pMON10814, the majority of the
protein appeared to be intact full length, indicating that the protein has been stabilized from proteolysis by
targeting to the chloroplast.

Example 11 - Targeting of B.t. Proteins to the Extracellular Space or Vacuole through the Use of Signal
Peptides

The B.t. proteins produced from the synthetic genes described here are localized to the cytoplasm of
the plant cell, and this cytoplasmic localization results in plants that are insecticidally effective. It may be
advantageous for some purposes to direct the B.t. proteins to other compartments of the plant cell.
Localizing B.t. proteins in compartments other than the cytoplasm may result in less exposure of the B.t.
proteins to cytoplasmic proteases, leading to greater accumulation of the protein and yielding enhanced
insecticidal activity. Extracellular localization could lead to more efficient exposure of certain insects to the
B.t. proteins, leading to greater efficacy. If a B.t. protein were found to be deleterious to plant cell function,
then localization to a noncytoplasmic compartment could protect these cells from the protein.

In plants as well as other eucaryotes, proteins that are destined to be localized either extracellularly or
in several specific compartments are typically synthesized with an N-terminal amino acid extension known
as the signal peptide. This signal peptide directs the protein to enter the compartmentalization pathway, and
it is typically cleaved from the mature protein as an early step in compartmentalization. For an extracellular
protein, the secretory pathway typically involves cotranslational insertion into the endoplasmic reticulum with
cleavage of the signal peptide occurring at this stage. The mature protein then passes through the Golgi body
into vesicles that fuse with the plasma membrane, thus releasing the protein into the extracellular space.

Proteins destined for other compartments follow a similar pathway. For example, proteins that are destined
for the endoplasmic reticulum or the Golgi body follow this scheme, but they are specifically retained in the
appropriate compartment. In plants, some proteins are also targeted to the vacuole, another membrane
bound compartment in the cytoplasm of many plant cells. Vacuole targeted proteins diverge from the
above pathway at the Golgi body where they enter vesicles that fuse with the vacuole.

A common feature of this protein targeting is the signal peptide that initiates the compartmentalization
process. Fusing a signal peptide to a protein will in many cases lead to the targeting of that protein to the
endoplasmic reticulum. The efficiency of this step may depend on the sequence of the mature protein itself
as well. The signals that direct a protein to a specific compartment rather than to the extracellular space are
not as clearly defined. It appears that many of the signals that direct the protein to specific compartments
are contained within the amino acid sequence of the mature protein. This has been shown for some vacuole
targeted proteins, but it is not yet possible to define these sequences precisely. It appears that secretion
into the extracellular space is the "default" pathway for a protein that contains a signal sequence but no
other compartmentalization signals. Thus, a strategy to direct B.t. proteins out of the cytoplasm is to fuse
the synthetic B.t. genes to DNA sequences encoding known plant signal peptides. These fusion
genes will give rise to B.t. proteins that enter the secretory pathway, and lead to extracellular secretion or
targeting to the vacuole or other compartments.

Signal sequences for several plant genes have been described. One such sequence is for the tobacco
pathogenesis related protein PR1b described by Cornelissen et al. The PR1b protein is normally localized
to the extracellular space. Another type of signal peptide is contained on seed storage proteins of legumes.
These proteins are localized to the protein body of seeds, which is a vacuole like compartment found in
seeds. A signal peptide DNA sequence for the beta subunit of the 7S storage protein of common bean
(Phaseolus vulgaris), PvuB, has been described by Doyle et al. Based on these published
sequences, genes were synthesized by chemical synthesis of oligonucleotides that encoded the signal
peptides for PR1b and PvuB. The synthetic genes for these signal peptides corresponded exactly to the
reported DNA sequences. Just upstream of the translational initiation codon of each signal peptide a BamHI
and a BglII site were inserted, with the BamHI site at the 5' end. This allowed the insertion of the signal
peptide encoding segments into the BglII site of pMON893 for expression from the En 35S promoter. In
some cases, to achieve secretion or compartmentalization of heterologous proteins, it has proved necessary
to include some amino acid sequence beyond the normal cleavage site of the signal peptide. This may be
necessary to insure proper cleavage of the signal peptide. For PR1b the synthetic DNA sequence also
included the first 10 amino acids of mature PR1b. For PvuB the synthetic DNA sequence included the first
13 amino acids of mature PvuB. Both synthetic signal peptide encoding segments ended with NcoI sites to
allow fusion in frame to the methionine initiation codon of the synthetic B.t. genes.

Four vectors encoding synthetic B.t.k. HD-73 genes were constructed containing these signal peptides.
The synthetic truncated HD-73 gene from pMON5383 was fused with the signal peptide sequence of PvuB
and incorporated into pMON893 to create pMON10827. The synthetic truncated HD-73 gene from
pMON5383 was also fused with the signal peptide sequence of PR1b to create pMON10824. The full length
synthetic HD-73 gene from pMON10518 was fused with the signal peptide sequence of PvuB and
incorporated into pMON893 to create pMON10828. The full length synthetic HD-73 gene from pMON10518
was also fused with the signal peptide sequence of PR1b and incorporated into pMON893 to create
pMON10825.

These vectors were used to transform tobacco plants and the plants were assayed for expression of the
B.t.k. protein by Western blot analysis and for insecticidal efficacy. pMON10824 and pMON10827 produced
amounts of B.t.k. protein in leaf comparable to the truncated HD-73 vectors, pMON5383 and pMON5390.
pMON10825 and pMON10828 produced full length B.t.k. protein in amounts comparable to pMON10518. In
all cases, the plants were insecticidally active against tobacco hornworm.

Claims

1. In a method for improving the expression of a heterologous gene in plants by modifying the structural
coding sequence of said gene, the improvement which comprises reducing the occurrence of polyadenylation
signals selected from the group consisting of AATAAA, AATAAT, AACCAA, ATATAA, AATCAA,
ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and
CATAAA.

2. The method of Claim 1 further comprising the improvement of reducing the occurrence of ATTTA
sequences within the structural coding sequence.
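
By way of illustration only, the occurrence of the sequences recited in Claims 1 and 2 can be tallied in a candidate coding sequence with a short script. The following is a minimal sketch and not the procedure used to prepare the synthetic genes; the function names `find_motifs` and `count_signals` and the toy sequence are hypothetical, and overlapping occurrences are counted separately as an assumption.

```python
import re

# The sixteen potential polyadenylation signals recited in Claim 1.
POLYA_SIGNALS = [
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA",
    "ATAAAA", "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA",
    "ATTAAA", "AATTAA", "AATACA", "CATAAA",
]
ATTTA = "ATTTA"  # the sequence recited in Claim 2


def find_motifs(seq, motifs):
    """Return sorted (position, motif) pairs; positions are 1-based.

    A zero-width lookahead is used so that overlapping occurrences
    are reported (an assumption made for this sketch).
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        for m in re.finditer("(?=" + motif + ")", seq):
            hits.append((m.start() + 1, motif))
    return sorted(hits)


def count_signals(seq):
    """Count polyadenylation signals and ATTTA sequences in seq."""
    return {
        "polyA": len(find_motifs(seq, POLYA_SIGNALS)),
        "ATTTA": len(find_motifs(seq, [ATTTA])),
    }


if __name__ == "__main__":
    toy = "GGAATAAAGGATTTAGG"  # hypothetical sequence, not taken from any figure
    print(find_motifs(toy, POLYA_SIGNALS))
    print(count_signals(toy))
```

Comparing such counts before and after modification gives a simple numerical check that the occurrence of the recited sequences has in fact been reduced.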

3. A method for modifying a wild-type structural gene sequence which encodes an insecticidal protein
of Bacillus thuringiensis to enhance the expression of said protein in plants which comprises:

a) removing polyadenylation signals contained in said wild-type gene while retaining a sequence
which encodes said protein; and

b) removing ATTTA sequences contained in said wild-type gene while retaining a sequence which
encodes said protein.

4. A method of Claim 3 further comprising the removal of self-complementary sequences and
replacement of such sequences with nonself-complementary DNA comprising plant preferred codons while
retaining a structural gene sequence encoding said protein.

5. A method of Claim 4 further comprising the use of plant preferred sequences in the removal of the
polyadenylation signals and ATTTA sequences.

6. A method of Claim 3 in which the polyadenylation signals are selected from the group consisting of
AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT,
ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA.

7. A method of Claim 4 in which the polyadenylation signals are selected from the group consisting of
AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT,
ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA.

8. A method of Claim 5 in which the polyadenylation signals are selected from the group consisting of
AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT,
ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA.

9. A method for modifying a wild-type structural gene sequence which encodes an insecticidal protein
of Bacillus thuringiensis to enhance the expression of said protein in plants which comprises:

a) identifying regions within said sequence with greater than four consecutive adenine or thymine
nucleotides;

b) modifying the regions of step (a) which have two or more polyadenylation signals within a ten base
sequence to remove said signals while maintaining a gene sequence which encodes said protein; and

c) modifying the 15-30 base regions surrounding the regions of step (a) to remove major plant
polyadenylation signals, consecutive sequences containing more than one minor polyadenylation signal and
consecutive sequences containing more than one ATTTA sequence while maintaining a gene sequence
which encodes said protein.
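
For illustration, step (a) of Claim 9 and the selection of the surrounding bases referred to in step (c) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the procedure actually used: the function names `at_rich_regions` and `flank`, the default flank width of 15 bases and the toy sequence are all hypothetical, and the codon-preserving modification itself is not shown.

```python
import re


def at_rich_regions(seq, min_run=5):
    """Step (a): find runs of more than four consecutive A or T bases.

    Returns (start, end) pairs, 1-based and inclusive.
    """
    pattern = "[AT]{%d,}" % min_run
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, seq.upper())]


def flank(seq, region, width=15):
    """Return the region plus `width` bases on either side.

    The window is clipped at the ends of the sequence. The width is an
    assumption for this sketch; the claim refers to 15-30 base regions.
    """
    start, end = region
    lo = max(0, start - 1 - width)
    hi = min(len(seq), end + width)
    return seq[lo:hi]


if __name__ == "__main__":
    toy = "GCGCAATAAAGCGCGCATTTAGC"  # hypothetical sequence
    for region in at_rich_regions(toy):
        print(region, flank(toy, region, width=2))
```

Each window returned by `flank` would then be examined for polyadenylation signals and ATTTA sequences before codon changes are chosen that leave the encoded protein unaltered.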

10. A method of Claim 9 in which the major plant polyadenylation signals are selected from the group
consisting of AATAAA and AATAAT.

11. A method of Claim 10 in which the polyadenylation signals are selected from the group consisting
of AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT,
ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA.

12. A method of Claim 11 further comprising the use of plant preferred sequences in the removal of
polyadenylation signals and ATTTA sequences.

13. A structural gene which encodes an insecticidal protein of Bacillus thuringiensis, said gene being
substantially devoid of polyadenylation signals and ATTTA sequences.

14. A structural gene of Claim 13 which is substantially devoid of polyadenylation signals selected from
the group consisting of AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA,
AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA.

15. A structural gene of Claim 13 which encodes an insecticidal protein of B.t.k. HD-1 having the
sequence:

1 ATGGCTATAGAAACTGGTTACACCCCAATCGATATTTCCT 40
41 TGTCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGG 80
81 TGCTGGATTTGTGTTAGGACTAGTTGATATTATCTGGGGA 120
121 ATTTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAA 160
161 TTGAACAGCTCATCAACCAGAGAATCGAAGAGTTCGCTAG 200
201 GAATCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTT 240
241 TATCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAG 280
281 ATCCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCA 320
321 ATTCAATGACATGAACAGTGCCCTTACAACCGCTATTCCT 360
361 CTTTTTGCAGTTCAAAATTATCAAGTTCCTCTCCTCTCCG 400
401 TGTACGTTCAAGCTGCCAACCTCCACCTCTCAGTTTTGAG 440
441 AGATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCC 480
481 GCGACTATCAATAGTCGTTATAATGATTTAACTAGGCTTA 520
521 TTGGCAACTATACAGATCATGCTGTACGCTGGTACAATAC 560
561 GGGATTAGAGCGTGTATGGGGACCGGATTCTAGAGATTGG 600
601 ATCAGGTACAACCAGTTCAGAAGAGAGCTTACACTAACTG 640
641 TATTAGATATCGTTTCTCTATTTCCGAACTATGATAGTAG 680
681 AACGTATCCAATTCGAACAGTTTCCCAATTAACAAGAGAA 720
721 ATTTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTT 760
761 TTCGAGGCTCGGCTCAGGGCATAGAAGGAAGTATTAGGAG 800
801 TCCACATTTGATGGATATACTTAATAGTATAACCATCTAT 840
841 ACGGATGCTCATAGAGGAGAATACTACTGGTCCGGTCACC 880
881 AGATCATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATT 920
921 CACTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCA 960
961 CAACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATA 1000
1001 GAACATTATCGTCCACCTTATATAGAAGACCTTTTAACAT 1040
1041 CGGGATCAACAACCAACAACTATCTGTTCTTGACGGGACA 1080
1081 GAATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTG 1120
1121 TATACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAAT 1160
1161 ACCGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTT 1200
1201 AGTCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCT 1240
1241 TTAGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTT 1280
1281 CTCTTGGATACATCGTAGTGCTGAGTTCAACAACATCATC 1320
1321 CCTTCATCACAAATCACCCAAATCCCACTCACCAAGTCTA 1360
1361 CTAATCTTGGCTCTGGAACTTCTGTCGTTAAAGGACCAGG 1400
1401 ATTTACAGGAGGAGATATTCTTCGAAGAACTTCACCTGGC 1440
1441 CAGATTTCAACCTTAAGAGTAAATATTACTGCACCATTAT 1480
1481 CACAAAGATATCGGGTAAGAATTCGCTACGCTTCTACCAC 1520
1521 AAACCTTCACTTCCACACATCAATTGACGGAAGACCTATT 1560
1561 AATCAGGGGAATTTTTCAGCAACTATGAGTAGTGGGAGTA 1600
1601 ATTTACAGTCCGGAAGCTTTAGGACTGTAGGTTTTACTAC 1640
1641 TCCGTTTAACTTTTCAAATGGATCAAGTGTATTTACGTTA 1680
1681 AGTGCTCATGTCTTCAATTCAGGCAATGAAGTTTATATAG 1720
1721 ATCGAATTGAATTTGTTCCGGCA 1743.

16. A structural gene of Claim 13 which encodes an insecticidal protein of B.t.k. HD-73 having the
sequence:

1 ATGGCCATTGAAACCGGTTACACTCCCATCGACATCTCCT 40
41 TGTCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGG 80
81 TGCTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGT 120
121 ATCTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAA 160
161 TTGAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAG 200
201 GAACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTC 240
241 TACCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCG 280
281 ATCCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCA 320
321 ATTCAACGACATGAACAGCGCCTTGACCACAGCTATCCCA 360
361 TTGTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCG 400
401 TGTACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCG 440
441 AGACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCT 480
481 GCAACCATCAATAGCCGTTACAACGACCTTACTAGGCTGA 520
521 TTGGAAACTACACCGACCACGCTGTTCGTTGGTACAACAC 560
561 TGGCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGG 600
601 ATTAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAG 640
641 TTTTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAG 680
681 AACCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAA 720
721 ATCTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCT 760
761 TCCGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAG 800
801 CCCACACTTGATGGACATCTTGAACAGCATAACTATCTAC 840
841 ACCGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACC 880
881 AGATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTT 920
921 TACCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCA 960
961 CAACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACA 1000
1001 GAACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATAT 1040
1041 CGGTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACA 1080
1081 GAGTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTG 1120
1121 TTTACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAAT 1160
1161 CCCACCACAGAACAACAATGTGCCACCCAGGCAAGGATTC 1200
1201 TCCCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGAT 1240
1241 TCAGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTT 1280
1281 CTCTTGGATACACCGTAGTGCTGAGTTCAACAACATCATC 1320
1321 GCATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAA 1360
1361 ACTTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATT 1400
1401 CACTGGTGGAGACCTCGTTAGACTCAACAGCAGTGGAAAT 1440
1441 AACATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACT 1480
1481 TCCCATCCACATCTACCAGATATAGAGTTCGTGTGAGGTA 1520
1521 TGCTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGT 1560
1561 AATTCATCCATCTTCTCCAATACAGTTCCAGCTACAGCTA 1600
1601 CCTCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTT 1640
1641 TGAAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATC 1680
1681 GTGGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTA 1720
1721 TCGACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGA 1760
1761 GGCTGAG 1767.

17. A structural gene of Claim 13 encoding an insecticidal protein of B.t.k. HD-1 having the sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80
81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120
121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160
161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200
201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240
241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280
281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320
321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360
361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400
401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440
441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480
481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520
521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560
561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680
681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720
721 TTCGACATTGTGTCTCTCTTCCCGAACTATCACTCCAGAA 760
761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800
801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840
841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880
881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920
921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960
961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000
1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040
1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080
1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120
1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160
1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200
1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240
1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280
1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320
1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360
1361 CATGGATTCATCGTAGTCCTGAGTTCAACAATATCATTCC 1400
1401 TTCCTCTCAAATCACCCAAATCCCATTGACCAAGTCTACT 1440
1441 AACCTTGGATCTGGAACTTCTGTCGTGAAAGGACCAGGCT 1480
1481 TCACAGGAGGTGATATTCTTAGAAGAACTTCTCCTGGCCA 1520

1521 GATTAGCACCCTCAGAGTTAACATCACTGCACCACTTTCT 1560
1561 CAAAGATATCGTGTCAGGATTCGTTACGCATCTACCACTA 1600
1601 ACTTGCAATTCCACACCTCCATCGACGGAAGGCCTATCAA 1640
1641 TCAGGGTAACTTCTCCGCAACCATGTCAAGCGGCAGCAAC 1680
1681 TTGCAATCCGGCAGCTTCAGAACCGTCGGTTTCACTACTC 1720
1721 CTTTCAACTTCTCTAACGGATCAAGCGTTTTCACCCTTAG 1760
1761 CGCTCATGTGTTCAATTCTGGCAATGAAGTGTACATTGAC 1800
1801 CGTATTGAGTTTGTGCCTGCCGAAGTTACCTTCGAGGCTG 1840
1841 AGTAC 1845.

18. A structural gene of Claim 13 encoding an insecticidal protein derived from B.t.k. HD-73 having the
sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80
81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120
121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160
161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200
201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240
241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280
281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320
321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360
361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400
401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440
441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480
481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520
521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560
561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680
681 TAGATACAACCAGTTCAGGAGAGAATTGACGCTCACAGTT 720
721 TTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 760
761 CCTACCCTATCCGTACAGTCTCCCAACTTACCAGAGAAAT 800
801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840
841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880
881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920
921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960
961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000
1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040
1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080
1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120
1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160
1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200
1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCT 1240
1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280
1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320
1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360
1361 CTTGGATACACCGTAGTGCTGAGTTCAACAACATCATCGC 1400

1401 ATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440
1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1480
1481 CTGGTGGAGACCTCGTTAGACTCAACAGCAGTGGAAATAA 1520
1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560
1561 CCATCCACATCTACCAGATATAGAGTTCGTGTGAGGTATG 1600
1601 CTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGTAA 1640
1641 TTCATCCATCTTCTCCAATACAGTTCCAGCTACAGCTACC 1680
1681 TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 1720
1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760
1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800
1801 GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 1840
1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTAATGCG 1880
1881 CTGTTTACGTCTACAAACCAGCTTGGACTCAAGACAAATG 1920


19. A structural gene of Claim 13 encoding the full-length insecticidal protein of B.t.k. HD-73 having the
sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80
81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120
121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160
161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200
201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240
241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280
281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320
321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360
361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400
401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440
441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480
481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520
521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560
561 AACCATCAATAGCCGTTACAACGACCTTACTAGCCTCATT 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680
681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720
721 TTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 760
761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800
801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840
841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880
881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920
921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960
961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000
1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040
1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080
1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120
1121 GTATCAACAACCAGCAACTTTCCGTTCTTCACGGAACAGA 1160
1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200



60 
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1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240 

• • * 

1241 CACCAC AGAACAACAATGTGCCACCC AGGCAAGGATTCTC 1200 

. 

1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGA7TC 1320 

. 

1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360 

1361 CTTGGATACACCGTAGTGCTGAGTTCAACAACATCATCGC 1400 

1401 ATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440 

1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1400 

1481 CTGGTGGAGACCTCGTTAGACTCAACAGCAGTGGAAATAA 1520 

1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560 

1561 CCATCCACATCTACCAGATATAGAGTTCCTGTGAGGTATG 1600
1601 CTTCTGTCACCCCTATTCACCTCAACGTTAATTCGGGTAA 1640
1641 TTCATCCATCTTCTCCAATACAGTTCCAGCTACAGCTACC 1680
1681 TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 1720
1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760
1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800
1801 GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 1840
1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 1880
1881 GCTGTTTACGTCTACAAACCAGCTCGGCCTCAAGACCAAT 1920
1921 GTGACGGATTATCATATTGATCAAGTGTCCAACTTGGTGA 1960
1961 CCTACCTCAGCGATGAGTTCTGTCTGGATGAAAAGCGAGA 2000
2001 ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 2040
2041 GAACGCAATTTACTCCAAGATTCAAATTTCAAAGACATTA 2080
2081 ATAGGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGAT 2120
2121 TACCATCCAGGGAGGTGACGACGTGTTCAAGGAGAACTAC 2160
2161 GTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 2200
2201 ACCTCTACCAGAAGATCGACGAGTCCAAGTTGAAAGCCTT 2240
2241 TACCCGTTATCAATTAAGAGGGTATATCGAAGATAGTCAA 2280
2281 GACCTCGAGATCTACCTCATCCGCTACAATGCAAAACATG 2320
2321 AAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 2360
2361 TTCAGCCCAAAGTCCAATCGGAAAGTGTGGAGAGCCGAAT 2400
2401 CGATGCGCGCCACACCTTGAATGGAATCCTGACTTAGATT 2440
2441 GTTCGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCA 2480
2481 TCATTTCTCCTTAGACATTGATGTAGGATGTACAGACTTA 2520
2521 AATGAGGACCTAGGTGTATGGGTGATCTTTAAGATTAAGA 2560
2561 CGCAAGATGGGCACGCAAGACTAGGGAATCTAGAGTTTCT 2600
2601 CGAAGAGAAACCATTAGTAGGAGAAGCGCTAGCTCGTGTG 2640
2641 AAAAGAGCGGAGAAAAAATGGAGAGACAAACGTGAGAAGT 2680
2681 TGGAATGGGAGACCAACATCGTCTACAAAGAGGCAAAAGA 2720
2721 ATCTGTAGATGCTTTATTTGTAAACTCTCAATATGATCAA 2760
2761 TTACAAGCGGATACGAATATTGCCATGATTCATGCGGCAG 2800
2801 ATAAACGTGTTCATAGCATTCGAGAAGCTTATCTGCCTGA 2840
2841 GCTGTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 2880
2881 GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTCTACG 2920
2921 ATGCCAGAAACGTCATCAAGAACGGTGACTTCAACAATGG 2960
2961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 3000

3001 GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGGAAT 3040
3041 GGGAAGCAGAAGTGTCACAAGAAGTTCGTGTCTGTCCGGG 3080
3081 TCGTGGCTATATCCTTCGTGTCACAGCGTACAAGGAGGGA 3120
3121 TATGGAGAAGGTTGCGTAACCATTCATGAGATCGAGAACA 3160
3161 ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 3200
3201 AATCTATCCAAATAACACGGTAACGTGTAATGATTATACT 3240
3241 GTAAATCAAGAAGAATACGGAGGTGCGTACACTTCTCGTA 3280
3281 ATCGAGGATATAACGAAGCTCCTTCCGTACCAGCTGATTA 3320
3321 TGCGTCAGTCTATGAAGAAAAATCGTATACAGATGGACGA 3360
3361 AGAGAGAATCCTTGTGAATTTAACAGAGGGTATACGGATT 3400
3401 ACACGCCACTACCAGTTGGTTATGTGACAAAAGAATTAGA 3440
3441 ATACTTCCCAGAAACCGATAAGGTATGGATTGAGATTGGA 3480
3481 GAAACGGAAGGAACATTTATCGTGGACAGCGTGGAATTAC 3520
3521 TCCTTATGGAGGAA 3534.



20. A structural gene of Claim 13 encoding a full-length insecticidal protein of B.t.k. HD-73 having the
sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80
81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120
121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160
161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200
201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240
241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCACGA 280
281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320
321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360
361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400
401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440
441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480
481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520
521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560
561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680
681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720
721 TTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 760
761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800
801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840
841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880
881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920
921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960
961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCGGAGTTTA 1000
1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCGCTCCACA 1040
1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080
1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120


1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160
1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200
1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240
1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280
1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320
1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360
1361 CTTGGATACACCGTAGTGCTGAGTTCAACAACATCATCGC 1400
1401 ATCCGATAGTATTACTCAAATCCCTGCAGTGAAGGGAAAC 1440
1441 TTTCTCTTCAACGGTTCTGTCATTTCAGGACCAGGATTCA 1480
1481 CTGGTGGAGACCTCGTTAGACTCAACAGCAGTGGAAATAA 1520
1521 CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 1560
1561 CCATCCACATCTACCAGATATAGAGTTCGTGTGAGGTATG 1600
1601 CTTCTGTGACCCCTATTCACCTCAACGTTAATTGGGGTAA 1640
1641 TTCATCCATCTTCTCCAATACAGTTCCAGCTACAGCTACC 1680
1681 TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 1720
1721 AAAGTGCCAATGCTTTTACATCTTCACTCGGTAACATCGT 1760
1761 GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGATTATC 1800
1801 GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 1840
1841 CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 1880
1881 GCTGTTTACGTCTACAAACCAACTAGGGCTAAAAACAAAT 1920
1921 GTAACGGATTATCATATTGATCAAGTGTCCAATTTAGTTA 1960
1961 CGTATTTATCGGATGAATTTTGTCTGGATGAAAAGCGAGA 2000
2001 ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 2040
2041 GAACGCAATTTACTCCAAGATTCAAATTTCAAAGACATTA 2080
2081 ATAGGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGAT 2120
2121 TACCATCCAAGGAGGGGATGACGTATTTAAAGAAAATTAC 2160
2161 GTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 2200
2201 ATTTGTATCAAAAAATCGATGAATCAAAATTAAAAGCCTT 2240
2241 TACCCGTTATCAATTAAGAGGGTATATCGAAGATAGTCAA 2280
2281 GACTTAGAAATCTATTTAATTCGCTACAATGCAAAACATG 2320

2321 AAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 2360
2361 TTCAGCCCAAAGTCCAATCGGAAAGTGTGGAGAGCCGAAT 2400
2401 CGATGCGCGCCACACCTTGAATGGAATCCTGACTTAGATT 2440
2441 GTTCGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCA 2480
2481 TCATTTCTCCTTAGACATTGATGTAGGATGTACAGACTTA 2520
2521 AATGAGGACCTAGGTGTATGGGTGATCTTTAAGATTAAGA 2560
2561 CGCAAGATGGGCACGCAAGACTAGGGAATCTAGAGTTTCT 2600
2601 CGAAGAGAAACCATTAGTAGGAGAAGCGCTAGCTCGTGTG 2640
2641 AAAAGAGCGGAGAAAAAATGGAGAGACAAACGTGAAAAAT 2680
2681 TGGAATGGGAAACAAATATCGTTTATAAAGAGGCAAAAGA 2720
2721 ATCTGTAGATGCTTTATTTGTAAACTCTCAATATGATCAA 2760
2761 TTACAAGCGGATACGAATATTGCCATGATTCATGCGGCAG 2800
2801 ATAAACGTGTTCATAGCATTCGAGAAGCTTATCTGCCTGA 2840
2841 GCTGTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 2880
2881 GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTATATG 2920
2921 ATGCGAGAAATGTCATTAAAAATGGTGATTTTAATAATGG 2960
2961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 3000
3001 GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGGAAT 3040
3041 GGGAAGCAGAAGTGTCACAAGAAGTTCGTGTCTGTCCGGG 3080
3081 TCGTGGCTATATCCTTCGTGTCACAGCGTACAAGGAGGGA 3120
3121 TATGGAGAAGGTTGCGTAACCATTCATGAGATCGAGAACA 3160
3161 ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 3200
3201 AATCTATCCAAATAACACGGTAACGTGTAATGATTATACT 3240
3241 GTAAATCAAGAAGAATACGGAGGTGCGTACACTTCTCGTA 3280
3281 ATCGAGGATATAACGAAGCTCCTTCCGTACCAGCTGATTA 3320
3321 TGCGTCAGTCTATGAAGAAAAATCGTATACAGATGGACGA 3360
3361 AGACAGAATCCTTGTGAATTTAACAGAGGGTATAGGGATT 3400
3401 ACACGCCACTACCAGTTGGTTATGTGACAAAAGAATTAGA 3440
3441 ATACTTCCCAGAAACCGATAAGGTATGGATTGAGATTGGA 3480
3481 GAAACGGAAGGAACATTTATCGTGGACAGCGTGGAATTAC 3520
3521 TCCTTATGGAGGAA 3534.



21. A structural gene of Claim 13 encoding a full-length insecticidal protein of B.t.k. HD-73 having the
sequence:

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1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40 

41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80 

. 

81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120 

. 

121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160 

161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT. 200 

»s .... 

201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAAT7 240 

20 241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 230 

281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320 

” 321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 3 60 

361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCG7ATTCAAT 40C 

30 

401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440 

441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480

• • • * 

481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520 

« • * * 

521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 5 60 

• • • * 

561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600 

• • • 

601 GG AAACT AC AC CG ACC ACGCTGTTC GTTGGT AC AAC ACTG 640 

641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGAT7GGA7 6 80 

681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGT7 7 20 

721 TTGGACATTGTGTCTC7CTTCCCGAACTATGACTCCAGAA 7 60 

7 61 CCTACCCTATCCG7ACAGTGTCCCAACTTACCAGAGAAA7 3C0 

801 C7atactaacccag7TC77Gagaacttcgacgg7A3.:tt': -:;j 

3 4 1 CG7GG7TCTGCCCAAGGTATCGAAGGC7CCA7CAGGAGG3 : ^ 3 

8 81 CACACTTG ATGGACATCTTGAACAGCATAACTA7C7AC AC 3 3 0 

921 CGATGCTCACAGAGGAGAGTATTACTGGTCT^GACACCAG 960 

961 ATCA7GGCCTC7CCAGTTGGATTCAGCGGGCCGGA:777A '.nc.Q 

1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCCCTCCACA 1040




1041 ACAACGTATCGTTGCTC AACTAGGTCAGGGTGTC7AC AGA 1080 

• . • 

1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120 

• • • 

1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160 

* * * * 

1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200 

1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 12 40 

1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280 

1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320 

1321 AGCAACAGTTCCQTGAGCA7CA7CAGAGCTCCTATGTTC7 1360 

1361 CTTGGATACACCGTAG7GCTGAGTTCAACAACATCATCGC 1400 

14 01 A7CCGATAGTAT7AC7CAAA7CCCT3CAG7GAAGGGAAAC 14 4-. 

14 4 1 TTTCTCTTCAACGGTTCTGTCATTTCAGGACJAGGATTCA i 

1481 CTGGTGGAGACCTCGTTAGACTCAACAGCAGTGGAAATAA 1520 

1521 CATTCAGAATAGAGGGTA7A7TGAAG7TCCAA7TCAC77C 1560 

1561 CCA7CCACATCTACCAGA7A7AGAG7TCG7GTGAGGTATG 1 600 

1601 CTTCTGTGACCCCTAT7CACC7CAACG77AATTGGGG7AA 16 4.' 


r >0 


55 

/n 
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1641 

TTCATCCATCTTCTCCAATACAGTTCCAGCTACAGCTACC 

1680 

s 

1681 

TCCTTGGATAATCTCCAATCCAGCGATTTCGGTTACTTTG 

1720 

>0 

1721 

• • • • 

AAAGTGCCAATGCTTT'TACATCTTCACTCGGTAACATCGT 

1760 

1761 

• * • • 

GGGTGTTAGAAACTTTAGTGGGACTGCAGGAGTGAT7ATC 

1800 

/ 5 

1801 

GACAGATTCGAGTTCATTCCAGTTACTGCAACACTCGAGG 

1340 


1841 

CTGAGTACAACCTTGAGAGAGCCCAGAAGGCTGTGAACCC 

1380 

20 

1881 

CCTCTTTACCTCCACCAATCAGC7TGGCTTGAAAACTAAC 

1920 

25 

1921 

GTTACTGACTATCACAT7GACCAAG7GTCCAACTTGG7CA 

I960 


1961 

CCTACCTTAGCGATGAG77CTGCC7CGACGAGAAGCC7CA 

ZOO 0 

m 

2 o C 1 

AC7CTCCGAoAAAw.7 . AAA w Aw G wC AAGwG i 1 - * w AGw *.*nw 

:: *;' 


2 0 4 1 

GAGAGGAATC7C7TGG1 4.;A rTCCAACTTCAAAGAOAT 2A 


)5 

2081 

ACAGGCAGCCAGAACG7GGT7GGGGTGGAAGCACCGGGAT 

:::o 

JO 

2121 

CACCATCCAAGGAGGCGACGA7GTG7TCAAGGAGAAC7AC 

2160 


2 161 

GTCACCCTC7CCGGAAC777CGACGAG7GCTACCC7ACC7 

■- : *- 

J5 

220 1 

ACTTGTACCACAAGA7CGA7GAGTCCAAACTCAAAGCG7T 

:: ■; o 


//r 
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s 


10 


15 


20 


25 


10 


35 


JO 


>5 


2241 

2281 

2321 

2361 

2401 

2441 

2481 

2521 

2561 

2 60 1 

2 64 1 

2681 

2721 

2761 

2801 


CACCAGGTATCAACTTAGAGGCTACATCGAAGACAGCCAA 
• • • 
GACCTTGAAATCTACTCGATCAGGTACAATGCCAAGCACG 
. 

AGACCGTGAATGTCCCAGGTACTGGTTCCCTCTGGCCACT 
TTCTGCCCAATCTCCCATTGGGAAGT3TGGAGAGCCTAAC 
AGATGCGCTCCACACCTTGAGTGGAATCCTGACTTGGACT 
GCTCCTGC AGGGATGGCGAGAAGTG7GCCC ACCA77C TC A 
TCACTTCTCCTTGGACATCGATGTGGGATGTACTGACCTG 
AATGAGGACCTCGGAGTCTGGGTCATCTTCAAGATCAAGA 
CCCAAGACGGACACGCAAGACTTGGCAACCTTGAGTTTCT 
CGAAGAGAAACCA77GG7CGG7GAAGC7 '7C0G7CG7 373 
AAGAGAGCAGAGAAGAAGTGGAu-j'jnv-AuAAL. .i. .j *vj 
7CGAA7GGGAAAC7AACA7CG777ACAAGGAGGCCAAAGA 
GTCCGTGGATGCTTTGTTCGTGAACTCO-'AATATGATCAG 
TTGCAAGCCGACACCAACATCGCCA7GA TCCACGCC 3CAG 
ACAAACG7G7GCACAGCA77CG lGAuUv- . -jA 


2280 

2320 

2360 

2400 

24 4C 

2480 

2520 

25 60 

2600 



2841' GTTGTCCGTGATCCCTGGTGTGAACGCTGCCATCTTCGAG 2880 

• * • • 

2881 GAACTTGAGGGACGTATCTTTACCGCATTCTCCTTGTACG 2920 

• • • • 

2921 ATGCCAGAAACGTCATCAAGAACGGTGACTTCAACAATGG 2960 

2961 CCTCAGCTGCTGGAATGTGAAAGGTCATGTGGACGTGGAG 3000 

3001 GAACAGAACAATCAGCGTTCCGTCCTGGTTGTGCCTGAGT 3040 

3041 GGGAAGCTGAAGTGTCCCAAGAGGTTAGAGTCTGTCCAGG 3030 

3081 TAGAGGCTACATTCTCCGTGTGACCGCTTACAAGGAGGGA 3120 

3121 TACGGT.GAGGGTTGCGTGACCATCCACGAGATCGAGAACA 3160 

3161 ACACCGACGAGCTTAAGTTCTCCAACTGCGTCGAGGAAGA 3200 

3 20 l AATCTATCCCAACAACACCGTTACTTGCAACGACTATAGT 2 2 4 2 

J •-5 1 'jTGAATCAG^jAAGAGTACGuAGGTGCCTACACTAuv. r JTA j J j 

32 81 ACAGAGGTTACAACGAAGCTCCTTCCGTTCCTGCTCACTA 3 320 

33 2 1 TGCCTCCGTGTACGAGGAGAAATCCTACACAGATGGCAGA 3 3 60 

33 6 1 CGTGAGAACCCTTGCGAGTTCAACAGAGGTTACAGGGAC7 3400 

34 0 1 ACACACCACTTCCAGTTGGCTATGTTACCAAGGAGC77GA 3 4 40 


34 4 1 GTACTTTCCTGAGACCGACAAAGTGTGGATCGAGATCGGT 3480 

34 81 GAAACCGAGGGAACCTTCATCGTGGACAGCGTGGAGCTTC 3 5 20 

3521 TCTTGATGGAGGAA 3534. 

22. A structural gene of Claim 13 which encodes an insecticidal protein of B.t.t. having the sequence:

• • • • 

1 ATGACTGCAGACAACAACACCGAAGCCCTCGACAGTTCTA 40 

s 

41 CCACTAAGGATGTTATCCAGAAGGGTATCTCCGTTGTGGG 80 

t I • « 

10 81 AGACCTCTTGGGCGTGGTTGGATTTCCCTTCGGTGGAGCC 120 

121 ' CTCGTGAGCTTCTATACAAACTTTCTCAACACCATTTGGC 160 

15 .... 

161 CAAGCGAGGACCCTTGGAAAGCATTCATGGAGCAAGTTGA 200 

201 AGCTCTTATGGATCAGAAGATTGCAGATTATGCCAAGAAC 2 40 

20 

241 AAGGCTTTGGCAGAACTCCAGGGCCTTCAGAACAATGTGG 230 

n 281 AGGACTACGTGAGTGCATTGTCCAGCTGGCAGAAGAACCC 3 20 

321 TGTTAGCTCCAGAAATCCTCACAGCCAAGGTAGGATCAGA 2 60 

)0 .... 

3 61 GAGT70TTCTCTCAAGCCGAATCCCACTTCAGAAA77C CA •!.: ' 

401 TGCCTAGC7T7GC7A7CTCCGG77ACGAGG77C7777CC7 440 

• • * . 

441 CACTACC7ATGC7CAAGC7GCCAAC ACCC AC77G777C7C 480 

4 81 CTTAAGGACGCTC AAATCTATGGAGAAGAGTGGGGAT ACG 520 

521 AGAAAGAGGACATTGCTGAGTTCTACAAGCGTCAACTTAA 560 

561 GCTCACCCAAGAGTACACTGACCATTGCGTGAAATGGTAT 500 

601 AACG77GG7C7CGA7AAGC7CAGAGGC7C77CC7ACGAG7 54C 

641 C77GGG7GAAC77CAACAGA7ACAGGAGAGAGA7GACC77 630 

681 GAC7G7GC7CGA7C77A7CGCAC7C777CCCT7GTACGA7 720 

721 G7GAGAC7C7ACCCAAAGGAAG7GAAAAC7GAGC77ACCA '40 

» j AG AC y i y y > y Ay »^ j Ay y y 7 A i T j i y vj' j A o * ,A,*\ y A^\ *, , , j , 

^ y * ACjyjyjoT iAi vju AAy 7 AC y i . C A* j v_ AA . A * y j AAAAy , At. ■* i 

841 A77AGGAAACCACA7C7C77CGAC7A7C77CACA0AA77C 44n 

881 AA77CCACACAAGG777CAACCAGGA7AC7A7GG7AACGA 320 

921 C7CC77CAAC7A7TGG7CCGG7AAC7A7G777CCAO.W ',A >40 

9 6 * y-y. AAGCA7TG>jA i C7 AA . ’ j A*. A . >. A7C A*. A7 - ’. 7* *. >. . k T 



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16C1 GAGCACCCTTCAACCAGTATTACTTTGACAAGACCATCAA 1640 

* 

1641 CAAAGGTGACACTCTCACATACAATAGCTTCAACTTGGCA 1680 

1681 AGTTTCAGCACACCATTTGAACTCTCAGGCAACAATCT’i’C 1720 

10 

• • , 

1721 AGATCGGCGTCACCGGTCTCAGCGCCGGAGACAAAGTCTA 1760 

rs 1761 CATCGACAAGATTGAGTTCATCCCAGTGAAC 1791. 

23. A structural gene of Claim 13 which encodes an insecticidal protein of B.t. entomocidus having the
sequence:


1 ATGGAGGAGAACAACCAAAACCAATGCATTCCATACAACT -1C 

41 GCTTGAGTAACC'CAGAAGAGGTATTGCTTGATGGAGAACG SC 

81 CATTTCAACCGGTAACTCTTCCATCGACATCTCCTTGTCC 1C 

121 TTGGTCCAGTTTCTGGTCAGCAACTT.CGTGCCA JGTGGTG : -■ 

161 GGTTCCTTCTCGGACTAATTGACT7CG7C7GGGGTATCG7 2 0 

201 TGGTCCATCTCAATGGGATGCATTCC7GG7GCAAATTGAG 2 *1 - 

2 4 1 CAGT7GATCAACGAGAGGATCGC7GAGT7CGCCAGGAACC 2 if v. 

291 (-To^CATCGv-TAA l *TGoAAGGAT7GGGCAATAACT7CAA '2 


50 


55 

/<?/ 


1 HI 


m 


80 


fcf* a 3H5 962 A1 


321 CA7CTA7G7GGAGGCC77CAAAGAG7GGGAAGAGGACCC7 3 60 
361 AAC AAC CC AGAG ACC CGC ACT AGGGTG ATC G AC AG ATTC A 400 

401 GAATCTTGGACGGCCTCTTGGAGAGAGATATCCCATCCTT 4 4 0 

* 

441 CAGAATCTCTGGCTTCGAAGTTCCTCTCTTGTCCGTGTAC 4 80 

481 GCTCAAGCAGCTAATCTTCACCTCGCTATCCTTCGAGACA 520 

521 G7G7CA7C777GGGGAAAGGTGGGGAT7GACCAC7A7CAA 5 60 

5 61 CG7CAA7GAGAA77ACAACAGAC77A7CAGGCACAT7GAC 600 

601 GAG7ACGCCGACCAC7GTGC7AACACC7ACAACCG7GGC7 54 0 

641 7GAACAA7C7CCC7AAG7CTAC77A7CAAGA77GGA77AC 550 

681 C7ACAACAGG77GAGGAGAGAC77GACGC7CACAG7777G 


GACA77GCAGC777C77CCCGAAC7A7GACAACAGGAGA7 


7 61 ACCC7A7CCAACCAG7GGG7CAAC77ACCAGAGAAG7C7A 500 
801 TACTGACCCAC77A7CAAC7TCAACCC7CAG77GCAAAG7 .540 
84 1 G7CGCCCAAC77CCCACA77CAACC7CA7CCAC7CCAGCC ^.40 


G7A7CAGGAACCCACAC77G777GACA7C77GAACAACC7 
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921' TACTATCTTCACCGATTGGTTCAGCGTTGGGCGTAACTTC 960 

5 * 

961 TATTGGGGTGGACACAGGGTCATCTCCTCTCTTATTGGAG 1000 

• • • 

1001 GTGGGAACATTACCTCTCCTATCTATGGACGTGAGGCAAA 104 0 

IQ 

1041 CCAGGAGCCACCACGTAGTTTCACCTTCAACGGTCCAGTC 1080 

rs 1081 TTCAGAACCTTGTCTAACCCTACCTTGAGATTGCTCCAGC 1120 

1121 AACC77GGCCAGC7CCACC777CAACC77AGAGG7G7TGA 1160 

20 ’ 

1161 GGGCGTTGAGTTCTCTACTCCTACCAACTCCTTCACTTAC 1200 

1201 AGAGGTAGAGGAACCGTTGATTCCTTGACCGAACTCCCAC 12 40 

.’5 

12 41 CAGAGGACAATAGCGTGCCACCCAGGGAAGGCTACTCCCA 12 0!' 

30 129 1 caggttgtgccacgcaaccttcgtgcagcgtt:cggaact 

1321 CCATTCCTCACTACAGGAGTTGTGTTCT'TATGGACTGATC 1 3 4 

15 • 

1361 GTAGTGC7ACTCTCACTAATACCATTGATCCCGAGAGGAT 1400 

1401 CAA7CAAATCCCA7TGG7CAAGGG777CCGTG7G7GGGGA 14 40 

L4 41 GvjAACTTC iGTCATCACAGGACCAGGCTTCACAGoAGGTG A $ 

* 1481 A7A7TCT7AGAAGAAACAC7T7TGGCGACT77G7GAGCC7 152: 




82 
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1521 

* • * « 

C C AAGTT AACATC AACTCTC CAATT ACTC AAAG AT AT C GT 

1560 

5 

1561 

» • • • 

CTCAGGTTTCGTTACGCATCTTCCCGTGACGCTAGAGTCA 

1600 

*0 

1601 

• • • • 

TCGTGCTCACCGGAGCAGCTTCTACCGGTGTCGGTGGACA 

1640 


1641 

AGTCTCCGTGAACATGCCACTCCAGAAGACTATGGAGATC 

1680 

1 5 

1691 

GGCGAGAACTTGACATCCAGGACC77CAGATACACCGAC7 

1720 


1721 

7CTC7AACCCTTTCAGTT7CCGTGCCAACCC7GACA7CA7 

L"C 

20 

1761 

TGGCATTAGCGAACAACCTCTCTTTGGAGCTGGTAGCATC 

1900 

25 

1801 

TCATCT^GCGAATTGTACATTGACAAGATTGAGATCATTC 

13 4 0 


1841 

77GCCGACGC7ACC77CGAGGC7GAG7C7GACC77GAGAG 

L38C 

to 

18 31 

\GCCCAGAAGGCTG7GAACGCCC7C777AC-;7CC7<:7AA7 

: ;■: : 


1921 

CAGA77GCC77GA.\AAC7GACG77AC7 2A-;7A7CA«:A77G 

• 


1961 

ACCAAG7G7CCAAC77GG7CGAC7GCC77AGCGA7GAG77 

:<) co 

w 

2001 

C7GCCTCGACGAGAAGCG7GAAC7C7CCGAGAAAG77AAA 

*» ■ / •* 'j 


204 1 

CACGCCAAGCG7C7CAGCGACGAGAGGAA7C7C77G.:AAG 



2081 

ACCCCAAC77CAGAGGCA7CAACAGGCAGv7CAGA<: IGTGG 






- 2 7 21 CGTT7ACAAGGAGGCCAAAGAGTCCGTGGA7GCT77G77C 27 60 

• • * . 

s 

2761 GTGAACTCCCAATATGATAGGTTGCAAGTGG ACACCAACA 2800 

• * * i 

2801 TCGCCATGATCCACGCTGCAGACAAACGTGTGCACAGGA7 284 0 

10 

2841 TCGTGAGGCTTACTTGCCTGAGTTGTCCGTG ATCCCTGGT 2880 

,s 2881 GTGAACGCTGCCATCTTCGAGGAACTTGAGGGACGTATCT 2 920 

2 921 77ACCGCA7AC7CC77G7ACGA7GCCAGAAACG7CATCAA 2 960 

20 

2 961 GAACGGTGACTTCAACAATGGCCTCTTGTGCTGGAATGTG 3000 

3001 AAAGGTCATGTGGACGTGGAGGAACAGAACAATCACCGTT 304 0 

:s 

3041 CCG7CC7GG77A7CCC7GAG7GGGAAGC7GAAG7G7CCCA 308 0 

.'o 3081 AGAGG77AGAG7C7G7CCA0G7AGACGC7ACA7777CCG7 ? 12 : 

31^1 'j7GACCGCT7ACAj s hjoAo j'jA7ACo'j7 jA . . G-.C.J* jA j 

J5 .... 

3161 CCA7CCACGAGA7CGAGGACAACACCGACGAGC77AAG77 3 200 

3 201 C7CCAAC7GCG7CGAGGAAGAAG7C7A' T ’CCCAACAAC ACC 3 2-10 

10 

32 4 1 G77AC77GCAACAAC7ACAC7GGGACCCAGGAAGAG7ACG -GSt' 

JS 3281 A A G G 7 A C C 7 A C AC 7 A G C C G 7 AA C C A A - J G 7 7 A C G A G G A A G C 3 3 2 2 


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3321 TTACGGAAACAATCCTTCCGTTCCTGCTGACTATGCCTCC 3360 

* 

3361 GTGTACGAGGAGAAATCCTACACAGATGGCAGACGTGAGA 3400 

* 

3401 ACCCTTGCGAGTCCAACAGAGGTTACGGTGACTACACACC 34 40 

34 41 ACTTCCAGCAGGCTATGTTACCAAGGACCTTGAGTACTTT 34 80 

3481 CCTGAGACCGACAAAGTGTGGATCGAGATCGGTGAAACCG 3 5 2C 

35 21 AGGGAACCTTCATCG7GGACAGCGTGGAGC7TC7CT7GAT 3 5 60 

3561 GGAGGAA 3567. 

24. A structural gene of Claim 13 which encodes a P2 insecticidal protein having the sequence:

1 A7GGACAACAACGTCT7GAAC7C7GG7AGAACAACCA7CT 4 C 

4» ACGv. ATACAACo . ^ j 7 juCTCAC 3A7C ~ ATTC A-oOTT 4 ' 

81 CGAACACAAGAGCCTCGACACTAT7CAGAAGGAG7GGA7C :c: 

121 GAATGGAAACGTACTGACCAC7C7C7C7ACG7CCCACC7G 160 

161 TGGTTGGAACAGTGTCCAGC77CC77C7CAAG a .AGG7CCG 2 00 

*- 0 - C . C i C7C ATCGGAAAAC 'TATCTTGTCCGAACTCTG' .3 T .1 < ' 


EP 0 385 962 A1 


*41 ATCATCTTTCCATCTGGGTCCACTAATCTCATGCAAGACA 280 

* * • • 

5 281 TCTTGAGGGAGACCGAACAGTTTCTCAACCAGCGTCTCAA 320 

321 CACTGATACCTTGGCTAGAGTCAACGCTGAGTTGATCGGT 350 

IQ * 

361 CTCCAAGCAAACATTCGTGAGTTCAACCAGCAAGTGGACA 400 

401 ACTTCTTGAATCCAACTCAGAA7CCTGTGCCTCTTTCCAT 440 

is 

441 CACTTCTTCCGTGAACAC7ATGCAGCAACTCTTCC7CAAC 480 

?o 481 AGATTGCCTCAGTTTCAGATTCAAGGCTACCAGT7GCTCC 520 

521 TTCTTCCACTCTTTGCTCAGGCTGCCAACATGCACTTGTC 560 

a ‘ 

561 CTTCATACGTGACGTGA7CCTCAACGCTGACGAATGGGGA 600 

601 ATC7CTGCAGCCAC7C7TAGGACATACAGAGACTAC77GA -4 

.10 

6 4 1 ■ j 'j AA lTACACTCG * j AT I* Av_ To CAAv_TAiTGv_A*\^AAGAG ^ : 

15 68 1 TTATCAGACTGCCTTTCGTGGACTCAATACTAGGCTTCAC '20 

721 GACATGCTTGAGTTCAGGACCTACATGTTCCTTAACGTGT '60 

41 ) • 

761 TTGAGTACGTCAGCATTTGCAG7CTCTTCAAG7ACCAGAG 4,v 

801 CTTGATGGTGTCCTCTGGAGCCAATCTCTACGGGTCTGG-: 6 4 ' 


a 4 1 AG7GGACCACAGC AAAC7C AGAGC 77 CACAGCTCAGAAC 7 880 

* 

881 ggccattcttgtatagcttgttccaagtcaactccaacta 920 

921 CATTCTCAGTGGTATCTCTGGGACCAGACTCTCCATAACC 960 

« 

1 • 9 

961 tttcccaacattggtggacttccaggctccactacaaccc 1000 

1001 ATAGCCTTAACTCTGCCAGAGTGAACTACAGTGGAGGTGT 104 0 

1041 CAGC7C7GGA77GA77GG7GCAAC7AACTTGAACCACAAC 1030 

1081 TTCAATTGCTCCACCGTCTTGCCACCTCTGAGCACACCGT 1120 

1121 T7G7GAGGTCC7GGC77GACAGCGGTAC7GA7CGCGAAGG 1160 

1151 AG77GCTACC7CTACAAACTGGCAAACCGAGTCC77CCAA IMO 

i.201 ACCACTC77AGCC77CGG7070GAGC777CTC7GCAC:J71 12 ; ' 

*^41 AA»7CAAAC . AC7T7CC AGAG7AC77 2A 77AGGAA> 1A T 1 2 - 

1281 C7C7GG7GTTCCTC7CG7CA7CAGGAA7GAAGACC7CACC 1322 

1321 CGTCCAC77CA77ACAACCAGA77AGGAACATCGAG7C7C 1J6C 

^ ^ ® ^ GA7CCGvj7ACTCCAGGAGG7GCAAGAGC77ACC7CG7G7C 14 2 0 

1401 7GTCCATAACAGGAAGAACAACA727AC22:7r.GCAAG.;AG 1.1 ' 






1441 

* • , 

AATGGCACCATGATTCACCTTGCACCAGAAGATTACACTG 

1480 

5 

1481 

• • • • 

GATTCACCATCTCTCCAATCCATGCTACCCAAGTGAACAA 

1520 


1521 

• • • . 

• 

TCAGACACGCACCTTCATCTCCGAAAAGTTCGGAAATCAA 

1560 

10 

1561 

* • • • 

GGTGACTCCTTGAGGTTCGAGCAATCCAACACTACCGCTA 

1600 

1 5 

1601 

GGTACAC7TTGAGAGGCAATGGAAACAGCTACAACCTTTA 

16.40 


1641 

CTTGAGAGTTAGCTCCATTGGTAACTCCACCATCCGTGTT 

1680 

20 

1681 

ACCATCAACGGACGTGTTTACACAGTCTCTAATGTGAACA 

1720 


1721 

CTACAACGAACAATGATGGCGTTAACGACAACGGAGCCAG 

1760 

25 

1761 

ATTCAGCGACA7CAACATTGGCAACATCC7GCCC7C7GAC 

1900 

JO 

1801 

AACACTAACG77ACTTTGGACA7CAA7GTGACCC7CAA77 

1 3 4 C 


194 1 

CTGGAAC'ILCA. T7 3A7C7CA7GAACA7CA70777G70CC 

16 90 

15 

1881 

AACTAACCTCCCTCCATTGTAC7AA 1905. 



25. A plant transformation vector comprising a plant gene containing a structural gene of Claim 13.

26. A structural gene sequence of Claim 13 encoding a fusion protein comprising the N-terminal 610
amino acids of B.t.k. HD-1 and the C-terminal 567 amino acids of B.t.k. HD-73, said gene having the
sequence:

1 ATGGACAACAACCCAAACATCAACGAATGCATTCCATACA 40
41 ACTGCTTGAGTAACCCAGAAGTTGAAGTACTTGGTGGAGA 80
81 ACGCATTGAAACCGGTTACACTCCCATCGACATCTCCTTG 120
121 TCCTTGACACAGTTTCTGCTCAGCGAGTTCGTGCCAGGTG 160
161 CTGGGTTCGTTCTCGGACTAGTTGACATCATCTGGGGTAT 200
201 CTTTGGTCCATCTCAATGGGATGCATTCCTGGTGCAAATT 240
241 GAGCAGTTGATCAACCAGAGGATCGAAGAGTTCGCCAGGA 280
281 ACCAGGCCATCTCTAGGTTGGAAGGATTGAGCAATCTCTA 320
321 CCAAATCTATGCAGAGAGCTTCAGAGAGTGGGAAGCCGAT 360
361 CCTACTAACCCAGCTCTCCGCGAGGAAATGCGTATTCAAT 400
401 TCAACGACATGAACAGCGCCTTGACCACAGCTATCCCATT 440
441 GTTCGCAGTCCAGAACTACCAAGTTCCTCTCTTGTCCGTG 480
481 TACGTTCAAGCAGCTAATCTTCACCTCAGCGTGCTTCGAG 520
521 ACGTTAGCGTGTTTGGGCAAAGGTGGGGATTCGATGCTGC 560
561 AACCATCAATAGCCGTTACAACGACCTTACTAGGCTGATT 600
601 GGAAACTACACCGACCACGCTGTTCGTTGGTACAACACTG 640
641 GCTTGGAGCGTGTCTGGGGTCCTGATTCTAGAGATTGGAT 680
681 TAGATACAACCAGTTCAGGAGAGAATTGACCCTCACAGTT 720
721 TTGGACATTGTGTCTCTCTTCCCGAACTATGACTCCAGAA 760
761 CCTACCCTATCCGTACAGTGTCCCAACTTACCAGAGAAAT 800
801 CTATACTAACCCAGTTCTTGAGAACTTCGACGGTAGCTTC 840
841 CGTGGTTCTGCCCAAGGTATCGAAGGCTCCATCAGGAGCC 880
881 CACACTTGATGGACATCTTGAACAGCATAACTATCTACAC 920
921 CGATGCTCACAGAGGAGAGTATTACTGGTCTGGACACCAG 960
961 ATCATGGCCTCTCCAGTTGGATTCAGCGGGCCCGAGTTTA 1000
1001 CCTTTCCTCTCTATGGAACTATGGGAAACGCCCCTCCACA 1040
1041 ACAACGTATCGTTGCTCAACTAGGTCAGGGTGTCTACAGA 1080
1081 ACCTTGTCTTCCACCTTGTACAGAAGACCCTTCAATATCG 1120
1121 GTATCAACAACCAGCAACTTTCCGTTCTTGACGGAACAGA 1160
1161 GTTCGCCTATGGAACCTCTTCTAACTTGCCATCCGCTGTT 1200
1201 TACAGAAAGAGCGGAACCGTTGATTCCTTGGACGAAATCC 1240
1241 CACCACAGAACAACAATGTGCCACCCAGGCAAGGATTCTC 1280
1281 CCACAGGTTGAGCCACGTGTCCATGTTCCGTTCCGGATTC 1320
1321 AGCAACAGTTCCGTGAGCATCATCAGAGCTCCTATGTTCT 1360
1361 CATGGATTCATCGTAGTGCTGAGTTCAACAATATCATTCC 1400
1401 TTCCTCTCAAATCACCCAAATCCCATTGACTAAGTCTACT 1440
1441 AACCTTGGATCTGGAACTTCTGTCCGOAAAGGACGAGGCT 1480
1481 TCACAGGAGGTGATATTCTTAGAAGAACTTCTCCTGGCCA 1520
1521 GATTAGCACCCTCAGAGTTAACATCACTGCACCACTTTCT 1560
1561 CAAAGATATCGTGTCAGGATTCGTTACGCATCTACCAGTA 1600
1601 ACTTCCAATTCCACACCTCCATCGACGGAAGGCCTATTAA 1640
1641 TCAGGGTAACTTCTCCGCAACCATGTCAAGCGGCAGCAAC 1680
1681 TTGCAATCCGGCAGCTTCAGAACCGTCGGTTTCACTACTC 1720
1721 CTTTCAACTTCTCTAACGGATCAAGCGTTTTCACCCTTAG 1760
1761 CGCTCATGTGTTCAATTCTGGCAATGAAGTGTACATTGAC 1800
1801 CGTATTGAGTTTGTGCCTGCCGAAGTTACCCTCGAGGCTG 1840
1841 AGTACAACCTTGAGAGAGCCCAGAAGGCTGTGAACGCCCT 1880
1881 CTTTACCTCCACCAATCAGCTTGGCTTGAAAACTAACGTT 1920
1921 ACTGACTATCACATTGACCAAGTGTCCAACTTGGTCACCT 1960
1961 ACCTTAGCGATGAGTTCTGCCTCGACGAGAAGCGTGAACT 2000
2001 CTCCGAGAAAGTTAAACACGCCAAGCGTCTCAGCGACGAG 2040
2041 AGGAATCTCTTGCAAGACTCCAACTTCAAAGACATCAACA 2080
2081 GGCAGCCAGAACGTGGTTGGGGTGGAAGCACCGGGATCAC 2120
2121 CATCCAAGGAGGCGACGATGTGTTCAACGAGAACTACGTC 2160
2161 ACCCTCTCCGGAACTTTCGACGAGTGCTACCCTACCTACT 2200
2201 TGTACCAGAAGATCGATGAGTCCAAACTCAAAGCCTTCAC 2240
2241 CAGGTATCAACTTAGAGGCTACATCGAAGACAGCCAAGAC 2280
2281 CTTGAAATCTACTCGATCAGGTACAATGCCAAGCACGAGA 2320
2321 CCGTGAATGTCCCAGGTACTGGTTCCCTCTGGCCACTTTC 2360
2361 TGCCCAATCTCCCATTGGGAAGTGTGGAGAGCCTAACAGA 2400
2401 TGCGCTCCACACCTTGAGTGGAATCCTGACTTGGACTGCT 2440
2441 CCTGCAGGGATGGCGAGAAGTGTGCCCACCATTCTCATCA 2480
2481 CTTCTCCTTGGACATCGATGTGGGATGTACTGACCTGAAT 2520
2521 GAGGACCTCGGAGTCTGGGTCATCTTCAAGATCAAGACCC 2560
2561 AAGACGGACACGCAAGACTTGGCAACCTTGAGTTTCTCGA 2600
2601 AGAGAAACCATTGGTCGGTGAAGCTCTCGCTCGTGTGAAG 2640
2641 AGAGCAGAGAAGAAGTGGAGGGACAAACGTGAGAAACTCG 2680
2681 AATGGGAAACTAACATCGTTTACAAGGAGGCCAAAGAGTC 2720
2721 CGTGGATGCTTTGTTCGTGAACTCCCAATATGATCAGTTG 2760
2761 CAAGCCGACACCAACATCGCCATGATCCACGCGGCAGACA 2800
2801 AACGTGTGCACAGCATTCGTGAGGCTTACTTGGCTGAGTT 2840
2841 GTCCGTGATCCCTGGTGTGAACGCTGCCATCTTCGAGGAA 2880
2881 CTTGAGGGACGTATCTTTACCGCATTCTCCTTGTACGATG 2920
2921 CCAGAAACGTCATCAAGAACGGTGACTTCAACAATGGCCT 2960
2961 CAGCTGCTGGAATGTGAAAGGTCATGTGGACGTGGAGGAA 3000
3001 CAGAACAATCAGCGTTCCGTCCTGGTTGTGCCTGAGTGGG 3040
3041 AAGCTGAAGTGTCCCAAGAGGTTAGAGTCTGTCCAGGTAG 3080
3081 AGGCTACATTCTCCGTGTGACCGCTTACAAGGAGGGATAC 3120
3121 GGTGAGGGTTGCGTGACCATCCACGAGATCGAGAACAACA 3160
3161 CCGACGAGCTTAAGTTCTCCAACTGCGTCGAGGAAGAAAT 3200
3201 CTATCCCAACAACACCGTTACTTGCAACCACTACACTGTG 3240
3241 AATCAGGAAGAGTACGGAGGTGCCTACACTAGCCGTAACA 3280
3281 GAGGTTACAACGAAGCTCCTTCCGTTCCTGCTGACTATGC 3320
3321 CTCCGTGTACGAGGAGAAATCCTACACAGATGGCAGACGT 3360
3361 GAGAACCCTTGCGAGTTCAACAGAGGTTACAGGGACTACA 3400
3401 CACCACTTCCAGTTGGCTATGTTACCAAGGAGCTTGAGTA 3440
3441 CTTTCCTGAGACCGACAAAGTGTGGATCGAGATCGGTGAA 3480
3481 ACCGAGGGAACCTTCATCGTGGACAGCGTGGAGCTTCTCT 3520
3521 TGATGGAGGAA 3531.

27. A method of Claim 4 further comprising removal of sequences comprising more than five
consecutive A+T or G+C bases.
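
The screening step in Claim 27 can be illustrated with a short sketch. It assumes one reading of the claim language, namely that a run is a stretch containing only A and T bases, or only G and C bases, and that runs longer than five bases are the ones to flag. The function name and the `max_run` parameter are hypothetical and are not taken from the specification.

```python
import re


def find_at_gc_runs(seq, max_run=5):
    """Return (start, end, kind) for every run longer than max_run bases
    that consists only of A/T or only of G/C (0-based, end exclusive)."""
    seq = seq.upper()
    n = max_run + 1
    # One alternative per base class; each must match at least n bases.
    pattern = re.compile(r"[AT]{%d,}|[GC]{%d,}" % (n, n))
    runs = []
    for m in pattern.finditer(seq):
        kind = "A+T" if m.group()[0] in "AT" else "G+C"
        runs.append((m.start(), m.end(), kind))
    return runs


if __name__ == "__main__":
    for start, end, kind in find_at_gc_runs("GGATCCAATTTAAGCGGCCGCAT"):
        print(kind, start, end)
```

Each reported run marks a region where a codon substitution might be considered; the sketch only locates candidates and does not choose replacements.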

28. A structural gene sequence of Claim 13 comprising a majority of plant preferred codons. 

29. A structural gene encoding the coat protein of potato leaf roll virus, said gene having the sequence:

1 ATGAGTACTGTCGTGGTTAAGGGAAACGTGAACGGTGGTG 40
41 TTCAACAACCTAGAAGGAGAAGAAGGCAATCCCTTCGTAG 80
81 GAGAGCTAACAGAGTTCAGCCAGTGGTTATGGTCACTGCT 120
121 CCTGGGCAACCAAGAAGGAGAAGAAGGAGAAGAGGAGGTA 160
161 ATCGCAGATCAAGAAGAACTGGAGTTCCCAGAGGAAGAGG 200
201 TTCAAGCGAGACATTCGTGTTTACAAAGGACAACCTCGTG 240
241 GGCAACTCCCAAGGAAGTTTCACCTTCGGACCAAGTGTTT 280
281 CAGACTGTCCAGCATTCAAGGATGGAATACTCAAGGCTTA 320
321 CCATGAGTACAAGATCACAAGTATCTTGCTTCAGTTCGTC 360
361 AGCGAGGCCTCTTCCACCTCTCCAGGCTCCATCGCTTATG 400
401 AGTTAGATCCACATTGCAAAGTTTCATCCCTCCAGTCCTA 440
441 CGTCAACAAGTTCCAAATCACAAAGGGTGGTGCTAAGACC 480
481 TATCAAGCTCGTATGATCAACGGAGTTGAATGGCACGATT 520
521 CTTCTGAGGATCAGTGCAGAATCCTTTGGAAAGGAAATGG 560
561 AAAGTCTTCAGATCCAGCTGGATCTTTCAGAGTTACCATC 600
601 AGAGTTGCTCTTCAAAACCCAAAG 624.

30. A chimeric plant gene which comprises a structural coding sequence encoding an insecticidal
protein of Bacillus thuringiensis, said structural coding sequence being modified to reduce the number of
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putative polyadenylation signals within said structural coding sequence. 

31. A chimeric plant gene of Claim 30 in which the polyadenylation signals are selected from the group
consisting of AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA, AAGCAT,
ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA.
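
As a rough illustration of how a coding sequence might be screened for the sixteen hexamers recited in Claim 31, the sketch below slides a six-base window along the sequence and reports every match. The constant and function names are hypothetical, and the scan covers only the strand as written; whether the complementary strand should also be examined is left as an open assumption.

```python
# Hexamers recited in Claim 31 as putative plant polyadenylation signals.
PUTATIVE_POLYA_SIGNALS = frozenset([
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA",
    "ATAAAA", "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA",
    "ATTAAA", "AATTAA", "AATACA", "CATAAA",
])


def find_polya_signals(seq):
    """Return (position, hexamer) for each listed signal found in seq.

    Positions are 0-based; overlapping matches are all reported.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in PUTATIVE_POLYA_SIGNALS:
            hits.append((i, hexamer))
    return hits


if __name__ == "__main__":
    for pos, signal in find_polya_signals("GGAATAAATTAAGG"):
        print(pos, signal)
```

Comparing the number of hits before and after modification gives a simple measure of whether the number of putative signals has been reduced.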

32. A chimeric plant gene of Claim 31 in which said structural coding sequence is further modified to
reduce the number of ATTTA sequences within said structural coding sequence.
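
Counting ATTTA sequences needs slightly more care than a plain substring count, because two occurrences can share a base (ATTTATTTA contains two). The following minimal sketch uses a zero-width lookahead so that overlapping occurrences are all found. The function name is hypothetical, and only the strand as written is examined.

```python
import re


def attta_positions(seq):
    """Return the 0-based start of every ATTTA in seq, overlaps included."""
    # A lookahead matches without consuming bases, so an occurrence that
    # begins on the final A of the previous one is still reported.
    return [m.start() for m in re.finditer(r"(?=ATTTA)", seq.upper())]


if __name__ == "__main__":
    print(attta_positions("CCATTTATTTAGG"))
```

The length of the returned list can be compared between a native and a modified coding sequence to confirm that the count has gone down.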

33. A chimeric plant gene of Claim 32 in which said structural coding sequence is substantially devoid
of polyadenylation signals and ATTTA sequences.

34. A transformed plant cell containing a gene of Claim 33. 

35. A transformed plant cell of Claim 34 selected from the group consisting of soybean, cotton, alfalfa,
oilseed rape, flax, tomato, sugarbeet, sunflower, potato, tobacco, maize, rice and wheat.

36. A plant comprising transformed plant cells of Claim 34. 

37. A plant of Claim 36 which comprises plant cells of Claim 35.

38. A seed produced by a plant of Claim 36. 
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FIGURE 1(B) 
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. 1 AT G GC T A T A ~ A A A CT GGTT AC. 1 . C CAA'!\ GATATTTCCT 40 
41 TGTCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGG 80 
81 TGCTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGA 120 

T C 

121 ATTTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAA 160 

161 TTGAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAG 200 
c C C G c G 

201 GAACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTT 240

T

241 TATCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAG 280
281 ATCCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCA 320
321 ATTCAATGACATGAACAGTGCCCTTACAACCGCTATTCCT 360

361 CTTTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAG 400

CC C C

401 TATATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAG 440

G C c CC c cc c

441 AGATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCC 480
481 GCGACTATCAATAGTCGTTATAATGATTTAACTAGGCTTA 520
521 TTGGCAACTATACAGATCATGCTGTACGCTGGTACAATAC 560

561 GGGATTAGAGCGTGTATGGGGACCGGATTCTAGAGATTGG 600

601 ATAAGATATAATCAATTTAGAAGAGAATTAACACTAACTG 640
CGCCGC GC T

641 TATTAGATATCGTTTCTCTATTTCCGAACTATGATAGTAG 680
681 AACGTATCCAATTCGAACAGTTTCCCAATTAACAAGAGAA

FIGURE 2A 
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721 ATTTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTT 760 

# • 

761 TTCGAGGCTCGGCTCAGGGCATAGAAGGAAGTATTAGGAG 800 

* • • • 

801 TCCACATTTGATGGATATACTTAATAGTATAACCATCTAT 840 

• • • 

841 ACGGATGCTCATAGAGGAGAATATTATTGGTCAGGGCATC 880

C C C T C

881 AAATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATT 920

G C

* • •

921 CACTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCA 960

961 CAACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATA 1000

1001 GAACATTATCGTCCACCTTATATAGAAGACCTTTTAATAT 1040

C

1041 AGGGATAAATAATCAACAACTATCTGTTCTTGACGGGACA 1080

c c c c

• • •

1081 GAATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTG 1120

1121 TATACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAAT 1160

1161 ACCGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTT 1200

1201 AGTCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCT 1240

1241 TTAGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTT 1280

1281 CTCTTGGATACATCGTAGTGCTGAATTTAATAATATAATT 1320

G C C C C C

1321 CCTTCATCACAAATTACACAAATACCTTTAACAAAATCTA 1360

C C C AC C C G

1361 CTAATCTTGGCTCTGGAACTTCTGTCGTTAAAGGACCAGG 1400

FIGURE 2B 
1401 ATTTACAGGAGGAGATATTCTTCGAAGAACTTCACCTGGC

1441 CAGATTTCAACCTTAAGAGTAAATATTACTGCACCATTAT
• • •

1481 CACAAAGATATCGGGTAAGAATTCGCTACGCTTCTACCAC

* • • •

1521 AAATTTACAATTCCATACATCAATTGACGGAAGACCTATT
CC T G C

1561 AATCAGGGGAATTTTTCAGCAACTATGAGTAGTGGGAGTA
1601 ATTTACAGTCCGGAAGCTTTAGGACTGTAGGTTTTACTAC
1641 TCCGTTTAACTTTTCAAATGGATCAAGTGTATTTACGTTA
1681 AGTGCTCATGTCTTCAATTCAGGCAATGAAGTTTATATAG
1721 ATCGAATTGAATTTGTTCCGGCA 1743

FIGURE 2C
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1 

• • • • 

ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 
CCA C AC 

40 

41 

• * • • 

ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGA 

C C G A T C T 

80 

81 

* * * • 

AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCTTG 
CCT C TC CC 

120 

121 

• • • • 

TCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG 
CTGAG GCCCGCGA 

160 

161 

CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAAT 
GCTCC CCC T 

200 

201 

TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 

C A T C G G 

240 

241 

GAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGA 

G GC GGC G C 

280 

281 

ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 

G C G G T G C 

320

321 

TCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGAT 

C C T GAGC C C 

3 60 

361 

CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 

C TC CC C G A 

400 

401 

TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT

C CTGCA CAT

440

441 

TTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA
GC CGCC CGCG 

480 

481 

TATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG

C A T C T CC CAGC GC TC

520

521 

ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 

C AGC G C T 

5 60 

561 

GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 
AC C CCCCT G 

600 

601 

GGCAACTATACAGATcATGCTGTaCGCTGGTACAATACGG 

A C C CC C TT CT 

640

641 

GATTAGAGCGTGTATGGGGACCGGATTCTAGAGATTGGAT 

C G C T T 

680


FIGURE 3A 
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681 

AAGATATAATCAATTTAGAAGAGAATTAACACTAACTGTA 

T CCGCG GCCAT 

720 

721 

• • • • 

TTAGATATCGTTTCTCTATTTCCGAACTATGATAGTAGAA 

G C T G C C CTCC 

7 60 

761 

CGTATCCAATTCGAACAGTTTCCCAA7TAACAAGAGAAAT 
CCTCT G CTC 

800 

801 

• • • • 

TTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 

C T TCTGCCC CC 

840 

841 

• • • • 

CGAGGCTCGGCTCAGGGCATAGAAGGAAGTATTAGGAGTC 

T T T C A T C CTCC C C 

880 

881 

CACATTTGATGGATATACTTAATAGTATAACCATCTATAC 

C CCTGCC T C 

920 

921 

GGATGCTCATAGAGGAGAATATTATTGGTCAGGGCATCAA 

C C GCTACG 

960 

961 

ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCA 

C C A T A CAGC C G T 

1000 

1001 

CTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACA 
CTC C C 

1040 

1041 

ACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGA 

C T C C 

1080 

1081 

acattatcgtccaccttatatagaagaccTtttaatatag 

CGT GC CC C 

1 • "> n 

4. * 4- V 

1121 

GGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGA 
TCCCG TC A 

1160 

1161 

ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTG7A 

G C C T T C T 

1200 

1201 

TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 

G C T CT C C 

12 4i' 

1241 

CGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAG 

A C T C CTC 

1230 

1281 

TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 
CCAGG CGC C CAC 

1320 

1321 

AGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCT 

C C TCC G C C C 

13 60 

1361 

CTTGGATACATCGTAGTGCTGAATTTAATAATATAATTCC 
AT G C C C 

1400 
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* • • • • 

1401 TTCATCACAAATTACACAAATACCTTTAACAAAATCTACT 14 40 

CT CC CAGCG 

• • * * 

1441 AATCTTGGCTCTGGAACTTCTGTCGTTAAAGGACCAGGAT 1480 

C A G C 

t * • * 

1481 TTACAGGAGGAGATATTCTTCGAAGAACTTCACCTGGCCA 1520 

C T A T 

• • • • 

1521 GATTTCAACCTTAAGAGTAAATATTACTGCACCATTATCA 1560 

AGCCC TCC CTT 

• • * * 

1561 CAAAGATATCGGGTAAGAATTCGCYACGCTTCTACCACAA 1600 

T C G T A A 

1601 ATTTACAATTCCATACATCAATTGACGGAAGACCTATTAA 164C 

C G ’ CCCC G C 

1641 TCAGGGG AATTTTTCAGCAACT ATG AGT AGTGGG AGT AAT 1680 

T C C C C TCA CCCC 

1681 TTACAGTCCGGAAGCTTTAGGACTGTAGGTTTTACTACTC 1720 

GA C CACC C 

1721 CGTTTAACTTTTCAAATGGATCAAGTGTATTTACGTTAAG 17 60 

TC CTC CTCCCT 

17 61 TGCTCATGTCTTC AATTCAGGCAATGAAGTTTATATAGAT 1 3 C 0 

C G T G C T C 

1801 CGAATTGAATTTGTTCCGGCAGAAGTAACCTTTGAGGCAG 13;0 

T G GTC T C T 

1841 AATAT 1845 

G C 


FIGURE 3C 
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1 

ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 
CCA C AC 

40 

41 

• • * • • 

ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGA 

C C G A T C T 

80 

81 

• • * * 

AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCT7G 
CCT C TC CC 

120 

121 

• • • • 

TC GC7 AAC GC AATTTCTTTTGAGTG AATTTGTTCC C GGTG 
CTGAG GCCCGCGA 

160 

161 

* * * * 

CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAAT 
GCTCC CCC T 

200 

201 

TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 

C A T C G G 

240 

241 

GAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGA 

G GC GGC G C 

280 

281 

ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 

G C G G T G C 

320 

321 

TC AAATTT ACGC AG AATCTTTT AG AG AG7GGG AAGC AG AT 

C C T GAGC C C 

360 

361 

CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 

C TC CC C G A 

400 

401 

TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT 

C C T G C A C AT 

440 

44 1 

TTTTGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA 
GC CGCC CGCG 

480 

481 

TATGTTCAAGCTGCAAATTTACATTTATCAGTT77GAGAG 

C A T C T CC CAGC GC TC 

5 20 

521 

ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 

C AGC G C T 

560 

561 

GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 
AC C CCCCT G 

600 

601 

GGCAACTATACAGATTATGC7G7ACGCTGGTACAATACGG 

A C C CC C TT CT 

6 4 0 

64 1 

GATTAGAACGTGTATGGGGACCGGA7TCTAGAGA77GGGT 

C G G C T T A 

660 
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681 AAGGTATAATCAATTTAGAAGAGAATTAACACTAACTGTA 7 20 

TACCGCG GCCAT 

• * • * 

721 TTAGATATCGTTGCTCTGTTCCCGAfTTATGATAGTAGAA 7 60 

G C T GT C C CTCC 

# * * • 

7 61 GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 800 

CCCTCT G CTC 

* « • • 

801 TTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 340 

C T TCTGCCC CC 

841 CGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAGTC 380 

TTTCATC G CTCC C C 

881 CACATTTGATGGATATACTTAACAGTATAACCATCTATAC 920 

C C CT G C T C 

921 GGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATCAA 960 

C CAAGG C TACG 

961 ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCA . 1000 

C C A T A CAGC C G T 

1001 CTTTTCCGCT ATATGG AACTATGGG AAATGC AGCTCCAC A 10 4 0 

CTC C C 

1041 ACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGA 1260 

C T C C 

1081 ACATTATCGTCCACTTTATATAGAAGACCTTTTAATATAG 110 0 

CGT CGC CC C 

1121 GG ATAAAT AATCAAC AACTATCTGTTCTTGACGGG ACAG A 1 16 0 

TCCC G TC A 

1161 ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTGTA 1200 

G C C T T C 7 

1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 12 4 0 

G C T CT C C 

1241 CGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAG 12 30 

A C T C CTC 

12 81 TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 13 20 

CCAGG CGC C CAC 

1321 AGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCT 13 60 

C C TCC G C C C 

13 61 CTTGGATACATCGTAGTGCTGAATTTAATAAT AT.AATTGC 

C G C C C C C 
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1401 

A7CGGA7AG7A77AC7CAAA7CCC7GCAG7GAAGGGAAAC 

C 

1440 

1441 

TTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATTTA 

C C C C C 

1480 

1481 

• ill 

CTGGTGGGGACTTAGTTAGATTAAA7AGTAGTGGAAATAA 
• A C C C C C C 

1520 

1521 

• • • • 

CATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACTTC 

1560 

1561 

• • * • 

CCATCGACATCTACCAGATATCGAGTTCGTGTACGGTATG 

C A GA 

1600 

1601 

C77CTG7AACCCCGAT7CACC7CAACG77AA77GGGGTAA 

G 7 

1640 

1641 

77CA7CCA777777CCAA7ACAG7ACCAGC7ACAGC7ACG 

C C 7 C 

1630 

1681 

7CA77AGA7AA7C7ACAA7CAAG7GA7777GG77A7777G 

C G C C C C C 

1720 

1721 

AAAGTGCCAA7GC77TTACA7CT7CA77AGG7AATATAGT 

C c c C 

17 50 

1761 

AGGTG77AGAAA7T7TAG7GGGAC7GCAGGAGTGATAATA 

G C T C 

1300 

1801 

GACAGAT7TGAATT7AT7CCAG77AC7GCAACaC7CGAGG 

C G C 

134: 

1841 

CTGAATA7AATCTGGAAAGAGCGCAGAAGGCGG7GAATGC 

A TGCG 

^ 3 3 i j 

1881 

GC7G777ACGTC7ACAAACCAACTAGGGC7A\AAACAAAT 
CTG7 ACG7CTACA C AGC7 G ACTC G CA 7G 

' £ 7 ^ 

1921 

G 1921 
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• • • • 

1 GAAAGAATAGAAACTGG7TACACCCCAA7CGA7A777CC7 40 

ATGGCC T C T C C C 

* * • • 

41 TGTCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGG 80 

CTGAG GCCCGCGA 
* * * * 

81 TGCTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGA 120 

GCTCC CCC T 

• « I • 

121 ATTTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAA 160 

C A T C G G 

• * • • 

161 TTGAAC AGTTAATTAACCAAAGAATAGAAG AATTCGCT AG 200 

G G C GGC G C 

201 GAACCAAGCCA777C7AGA77AGAAGGAC7AAGCAA7C77 24C 

G C G G T G C 

241 7A7CAAA777ACGCAGAA7C7T77AGAGAG7GGGAAGCAG 29C 

C C T GAGC C C 

281 A7CC7AC7AA7CCAGCA77AAGAGAAGAGA7GCG7A77CA 320 

C 7C CC C G A 

321 A77CAA7GACA7GAACAG7GCCC77ACAACCGC7A77CC7 3 6 C 

C C7GCA CA 

3 61 C77777GCAG77CAAAA77A7CAAG77CC7C7777A7CAG ICC 

7GC CGCC CGC 

4 01 7A7A7G77CAAGC7GCAAA777ACA777A7CAGT777GAG 4 4 C 

G C A 7 C 7 CC CAGC GC 7C 

441 AGA7G777CAG7G777GGACAAAGG7GGGGA777GA7GCC 4 60 

C AGC G C 7 

481 GCGAC7A7CAA7AG7CG77A7AA7GAT~7AAC7AGGC77A 5C.' 

AC C C C CC 7 G 

521 7TGGCAAC7A7ACAGA77A7GC7G7ACGC7GG7ACAA7AC 3 60 

A CCCCC 77 C 

5 61 GGGA7TAGAACG7G7A7GGGGACCGGA77C7AGAGA77GG 60 C 

7 C G G C 7 7 

601 G7AAGG7A7AA7CAA777AGAAGAGAAT7AACAC7AAC7G 64: 

A7ACCGCG GCCA 

64 1 7A77AGA7A7CG77GC7C7G77CCCGAA77ATGA7AGTAG 6-: 

7 G C 7 G7 C C C7CC 
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681 AAGATATCCAA77CGAACAG777CCCAA77AACAAGAGAA 7 20 

CCCTCT G CTC 

* * * * 

721 ATTTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTT 7 60 

C T TCTGCCC C 

• • • * 

761 TTCGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAG 800 

CTTTCATC G CTCC C 

* * • * 

801 TCCACATTTGATGGATATACTTAACAGTAT AACCATCTAT 840 

C C CCTG C T C 

• * * * 

841 ACGGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATC 880 

C CAAGG C TAC 

881 AAATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATT 920 

G C C A T A CAGC C G 

921 CACTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCA 960 

T C T C C C 

961 CAACAACGTATTGTTGCTCAACTAGGTCAGGGCGTGTA7A 1000 

C T C C 

1001 GAACA77A7CG7CCAC777A7A7AGAAGACC7777AA7A7 1040 

C G T • CGC CC 

1041 AGGGA7AAA7AA7CAACAAC7A7C7G77C77GACGGGACA 1080 

CTCCCG TC A 

1081 GAA777GC77A7GGAACC7CC7CAAA7T7GCCA7CCGC7G 1120 

G C C 7 7 C 

1121 7A7ACAGAAAAAGCGGAACG37AGAT7C3CTGGATGAAAT 11. •--J 

7 G C 7 C7 C 

1161 ACCGCCACAGAA7AACAACG7GCCACC7AGGCAAGGAT77 12 00 

C A C 7 c >: 

1201 AG7CA7CGA77AAGCCATGT77CAA7G777CGT7CAGGC7 12 4,' 

7CC CAGG CGC C CA 

1241 77AG7AA7AG7AG7G7AAG7A7AA7AAGAGC7CC7A7G77 12 90 

C C C 7CC G C C C 

1281 C7C77GGA7ACA7CG7AG7GC7GAA7TTAATAATA7AA77 1320 

C G C C C C C 

1321 GCA7CGGA7AG7A77AC7C AAATCCC7GCAGTG AAGGGAA 13 60 


13 61 AC777C77777AA7GG77CTG7AA777CAGGACCAGGA77 

c c c c 
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1401 

TACTGGTGGGGACTTAGTTAGATTAAATAGTAGTGGAAAT 

C ACC CCCC 

1440 

1441 

* • • • 

AACATTCAGAATAGAGGGTATATTGAAGTTCCAATTCACT 

1480 

1481 

TCCCATCGACATCTACCAGATATCGAGTTCGTGTACGGTA 

C A GA 

1520 

1521 

TGCTTCTGTAACCCCGATTCACCTCAACGTTAATTGGGGT 

G T 

1560 

1561 

• • • • 

AATTCATCCATTTTTTCCAATACAGTACCAGCTACAGCTA 

C C T 

1600 

1601 

CGTCATTAGATAATCTACAATCAAGTGATTTTGGTTATTT 
CCG C CC C C 

1640 

1641 
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ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 

G C G G T G C 
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C C T GAGC C C 
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CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 

C TC CC C G A 

4 00 

401 

TCAATGACATGAACAGTGCCCTTACAACCCCTATTCCTCT 

C CTGCA CAT 
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TT7TGCAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTA 
GC CGCC C.GC G 

•» t 
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TATGTTCAAGCTGCAAATTTACATTTATCAGT7TT0AGAG 

C A T C T CC CAGC GC TC 

:■:: 
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ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 
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.681 AAGGTATAATCAATTTAGAAGAGAATTAACACTAACTGTA 720 

TACCGCG GCCAT 

* * * * 

721 TTAGATATCGTTGCTCTGTTCCCGAATTATGATAGTAGAA 7 60 

G C T GT C C CTCC 

761 GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 800 

CCCTCT G CTC 

801 TTATACAAACCCAGTATTAGAAAATTTTGATGGTAGTTTT 840 

C T TCTGCCC CC 

• # • * 

841 CGAGGCTCGGCTCAGGGCATAGAAAG.JVGTATTAGGAGTC 880 

TTTCATC G CTCC C C 

* « • * 

881 CACATTTGATGGATATACTTAACAGTATAACCATCTATAC 920 

C C CT G C T C 

921 GGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATCAA 960 

C CAAGG C TACG 

961 ATAATGGCTTCTCCTGTAGGG7TTTCGGGGCCAGAATTCA 1000 

C- C A T A CAGC C G T 

i • • * 

1001 CTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACA 1040 

CTC C C 

1041 ACAACGTATTGTTGCTCAACTAG'JTCAGGCCGTGTATAGA 108 0 

C ICC 


1031 ACATTATCGTCCACTTTATArAGa VJACC7~T7AA7ATAG 11Z -0 

CGT CGC CC C 

1121 GGATAAATAATCaACAACTATCTGTTCTTGACGGGACAGA 

TCCCG TC A 

1161 ATTTGCTTATGGAACC7CC7CAAAT7TGCCA7CCGC7G7A 120 0 

G C C 7 7 C 7 

1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 1240 

G C T CT C C 

12 41 CGCCACAGAATAACAACG7GCCACCTAGGCAAGGATTTAG 129 0 

A C T C CTC 

1281 TCATCGATTAAGCCATG777CAATGTTTCGT7CAGGCTTT 1320 

C CA G G CGC C C A C 

1321 AGTAATAGTAGTG7AAG7A7AA7AAGAGC7CC7A7GT7CT 1360 

C C TC C G C C C 

1361 CTTGGATACATCG7AG7GC7GAATTTAATAA~A7AAT7GC 1400 

C G C C C C C 
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1401 
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1440 

1441 

* * • • 

TTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATTTA 
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1481 ‘ 

• • • • 
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A C C C C C C 
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• • • • 
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C A GA 

1600 

1601 

CTTCTGTAACCCCGATTCACCTCAACGTTAATTGGGGTAA 

G T 

1640 

1641 
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TCATTAGATAATCTACAATCAAGTGATTTTGGTTATTTTG 

C G C C C C C 
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CTGAATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGC 

: i .-i ■: 

1-881 

GCTGTTTACGTCTACAAACCAACTAGGGCTAAAAACAAAT 
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GTAACGGATTATCATATTGATCAAGTGTCCAATTTAGTTA 
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CGTATTTATCGGATGAATTTTGTCTGGATGAAAAGCGAGA 

200 0 

2001 

ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 

2 0 4 0 

2041 

GAACGCAATTTACTCCAAGATTCAAATTTCAAAGAGA77A 

: o 4 o 

2081 

ATAGGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGA7 

2 120 
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2-121 TACCATCC AAGGAGGGG ATGACGTATTTAAAGAAAATTAC 2160 

2161 GTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 2200 

• • • • 

2201 ATTTGTATCAAAAAATCGATGAATC AAAATTAAAAGCCTT 2240 

• • • • 

2241 TACCCGTTATC AATTAAGAGGGTATATCGAAGATAGTCAA 2280 

2281 GACTTAGAAATCTATTTAATTCGCTACAATGCAAAACATG 2320 

2321 AAACAGTAAATGTGCCAGGTAC.GGGTTCCTTATGGCCGCT 2 360 

23 61 TTCAGCCCAAAGTCCAATCGGAAAGTGTGGAGAGCCGAAT 2 400 

2 401 CGATGCGCGCC AC ACCTTGAATGGAATCCTGACTT AGATT 2 4 4 0 

2 4 4 1 GTTCGTGTAGGGATGGAGAAAAG7GTGCCC ATCATTCGCA 2 4 30 

2481 TCATTTCTCCTTAGACA77GATGTAGGA7G1’ACAGAC77A 2 52 0 

252 1 AA7GAGGACCTAGGTG7A7GGG7GA7C777AAGAT7AAGA 2 5 30 

2561 CGCAAGATGGGCACGCAAGAC7AGGGAA7C7AGAG777C7 2 300 

2 601 CGAAGAGAAACCA77AG7AGGAGAAGCGC7AGC7CGTG7G 2 3-10 

2 641 AAAAG AGCGG AGAAAAAA7GGAGAG AC AAACG7G AAAAA7 2 3 3 0 

2 681 7GGAATGGGAAACAAA7A7CG777ATAAAGAGGCAAAAGA 2 7 20 

2721 ATCTGTAGATGC777ATT7GTAAAC7C7C AA7ATG A7C AA 2 7 30 

2761 7TACAAGCGGA7ACGAA7A77GCCA7GA77CA7GCGGCAG 23'.': 

2801 ATAAACGTGT7CA7AGCA77CGAGAAGCT7ATCTC-CC7GA 2 3 10 
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• • • • 

2 841 GCTGTCTGTGATTCCGGGTGTC AATGCGGCTATTTTTGAA 

• • • • 

2881 GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTATATG 

• • • • 

2921 ATGCGAGAAATGTCATTAAAAATGGTGATTTTAATAATGG 

2961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAG ATGT AGAA 

3001 GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGG AAT 
3041 GGGAAGC AGAAGTGTCACAAGAAGTTCGTGTCTGTCCGGG 

3081 TCGTGGCTATATCCTTCGTGTCACAGCGTACAAGGAGGGA 

3121 TATGGAGAAGGTTGCGTAACC ATTCATGAGATCGAG AACA 

3161 ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 

3201 AATCT ATCC AAAT AAC ACGGT AACGTGTAATG ATT AT ACT 

3 2 4 1 GTAAATC AAGAAGAA7ACGGAGGTGCGTACACT7C7CG7A 

3281 ATCG AGG AT AT AACGAAGCTCCTTCCGTACCAGC7G ATT A 

3321 TGCGTCAGTCTATGAAGAAAAATCGTATACAGA7GGACGA 

3 3 61 AGAGAGAATCC7TGTGAATT7AACAGAGGG7A7AGGGATT 

3401 ACACGCCACTACCAGTTGGTTATGTGACAAAAG AATTAGA 

34 4 1 ATACTTCCCAG AAACCGATAAGGTATGGA77GAGATTGGA 

34 8 1 GAAACGGAAGGAACA77TATCG7GGACAGCG7GGAA77AC 

3521 TCCTTATGGAGGAA 3534 
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G C T GT C C CTCC 

• • • * 
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2801 ATAAACGTGTTCATAGCATTCGAGAAGCTTATCTGCCTGA 284 0 

• I , 

2841 GCTGTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 2880 

2881 GAATTAGAAGGGCG7AT77TCAC7GCAT7CTCCCTA7A7G 2920 

C C 

2921 ATGCGAGAAATGTCATTAAAAATGGTGATTTTAATAATGG 2960 
C C. CGC CCC 

2961 CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 3000 

* “ • « 

3001 GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGGAAT 304 0 

3041 GGGAAGCAGAAGTGTCACAAGAAGTTCGTGTCTGTCCGGG - 3080 

* • * . 

3081 7CG7GGC7A7A7CC77CG7G7CACAGCG7ACAAGGAGGGA 3120 

3121 TATGGAGAAoCTTGCGTAACCATTCATGAGATCGAGAACA 3160 

• • • . 

3161 ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 3200 

3201 AATCTATCCAAATAACACGGTAACGTGTAATGATTATACT 32 4: 

32 41 GTAAATCAAGAAGAATACGGAGGTGCGTACACT7CTCG7A 32 30 

3281 ATCGAGGATATAACGAAGCTCCT7CCG7ACCAGCTGAT7A 332 ; 

3321 7GCG7CAG7C7A7GAAGAAAAA7CG7A7ACAGA7GGACGA 3 3 0 

33 61 AGAGAGAA7CC77G7GAA777AACAGAGGG7A7AGGGA77 3 400 

3401 ACACGCCAC7ACCAG77GG77A7G7GACAAAAGAA77AGA 3440 

3441 A7AC77CCCAGAAACCGA7AAGG7A7GGA77GAGA77GGA 343: 

3481 GAAACGGAAGGAACA777A7CG7GGACAGCG7GGAA77AC 352; 

3521 7CC77A7GGAGGAA 3534 
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1 ATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA 40 
CCA c Ac 

41 attgtttaagtaaccctgaagtagaagtattaggtggaga so 

C C G A T C T 

81 AAGAATAGAAACTGGTTACACCCCAATCGATATTTCCTTG 120 
CCT C TC CC 

• * * , 

121 TCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG 160 
CTGAG GCCCGCGA 
' • 

161 CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGG AAT- 200 
GCTCC CCC T 

201 TTTTGGTCCCTCTCAATGGGACGCATTTCTTGTACAAATT 2 40 
C A T C G G 

* * * * 

241 G AACAGTTAATTAACC AAAGAATAGAAGAATTCGCT AGG A 2 80 

G GC G G C G C 

281 ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTA 3 20 
G C G G T G C 

321 TCAAATTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGAT 3 60 
C C T GAGC C C 

361 CCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT 4 0C 
C TC CC C G' A 

401 TCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCT 4 40 
C CTGCA CAT 

4 41 TTTTGC AGTTC AAAATTATCAAGTTCCTCTTTTATC AGTA 4 30 
GC CGCC CGCG 

4 81 TATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG 5 3 C 
C A T C T CC CAGC GC TC 

521 ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGC 560 
C AGC G C T 

561 GACTATCAATAGTCGTTATAATGATTTAACTAGGCTTATT 600 
AC C CCCCT G 

601 GGCAACTATACAGATTATGCTGTACGCTGGTACAATACGG 640 
A CCCCC TT CT 

641 GATTAGAACGTGTATGGGGACCGGATTCTAGAGA7TGGGT 630 
C G G C T T A 
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681 AAGGTATAATCAATTTAGAAGAGAATTAACACTAACTGTA 
TACCGCG GCCAT 

721 TTAGAT ATCGTTGCTCTGTTCCCG AATTATGATAGT AGAA 
G C T GT C C CTCC 

761 GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAAT 
CCCTCT G CTC 

• • * 

801 TTATAC AAACCCAGTATTAGAAAATTTTG ATGGT AGTTTT 
C T TCTGCCC CC 

841 CGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAGTC 
TTTCATC G CTCC C C 
• * • . 

881 C ACATTTG ATGG ATATACTTAACAGTATAACCATCT AT AC 
C C CT G C T C 

• • • • 

921 GGATGCTCATAGGGGTTATTATTATTGGTCAGGGCATCAA 
. C CAAGG C TACG 

961 ATAATGGCTTCTCCTGTAGGGTTTTCGGGGCC AG AATTC A 
C C A T A CAGC C G T 

1001 CTTTTCCGCTATATGGAACTATGGGAAATGCAGC7CCACA 
CTC C C 

1041 ACAACGTATTGTTGCTC AACTAGGTCAGGGCGTGTAT AG A 
C T C C 

1081 ACATTATCGTCC ACTTTATATAGAAGACCTTTTAAT AT AG 
CGT G C C C C 

1121 GGATAAATAATCAACAACTATCTGTTCTTGACGGG ACAGA 
TCCCG TC A 

1161 ATTTGCTTATGGAACCTCCTCAAATTTGCCATCCGCTGTA 
G C C T T C T 

1201 TACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 
G C 7 CT C C 

1241 CGCCAC AGAATAAC AACGTGCCACCTAGGC AAGG ATTTAG 
A C T C CTC 

1281 TCATCGATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTT 
CCAGG CGC C CAC 

13 21 AGT AATAGTAGTGTAAGTATAATAAGAGCTCC7ATG77CT 
C C TCC G C C C 

CTTGGATACATCG7AG7GC7GAATT7AA7AATA7.AAT7GC 
C G C C- C C C 
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1801 

GACAGATTTGAA7rTAT7CCAG7TAC73CAACACTCGAGG 

C G C 

* .2 1 
- j *t 'j 

1841 

CTGAATATAATCTGGAAAGAGCGCAGAAGGCGG7GAA7GC 

G CCTG C 7 C 

13 30 

1881 

GCTGTTTACGTCTACAAACCAAC7AGGGCrVAAAACAAA7 
CC CCC7G7 C7G TC 

1 '} Z 0 

1921 

GTAACGGATTATCATA7TGA7CAAG7G7CCAATTTAGTTA 
TTC C C CGC 

13 60 

1961 

CGTATTTATCGGATGAATT7TG7C7GGA7GAAAAGCGAGA 

C CC TAGC G C C C C G T 

2000 

2001 

ATTGTCCGAGAAAGTCAAACATGCGAAGCGACTCAGTGAT 
CC T CC T CC- 

■*> ,> • /’S 

— 

2041 

GAACGCAATTTACTCCAAGATTCAAA777CAAAGACA77A 

GA G C CT G C C C C 


2081 

AT AGGCAACCAGAACG i GGGTGGGoCGGAAGTAC AGoGAT 

C G T T C C 

: ::o 
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2121 

2161 

2201 

2241 

2291 

2321 

2361 

2401 

2441 

2481 

2521 

2561 

2501 
2 641* 
2681 

2721 

2 7 51 

2301 


• * • . 

TACCATCCAAGGAGGGGATGACGTATT7AAAGAAAA7TAC 
C CCTGCGGC 

* • • • 

GTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACAT 
CCCATCC CTC 

* * • 

ATTTGTATCAAAAAATCGATGAATCAAAATTAAAAGCCTT 
C CGG GCCC 

TACCCGTTATCAATTAAGAGGGTATATCGAAGATAGTCAA 
CAG CT CC CC 


GACTTAGAAATCTATTTAATTCGCTACAATGCAAAACATG 
CT CCGCAG CGC 

* • • . 

AAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCT 
GCG C T CC A 

• • • * 

TTCAGCCCAAAGTCCAATCGGAAAGTGTGGAGAGCCGAAT 
T TC C T G T C 

CGATGCGCGCCACACCTTGAATGGAATCCTGACTTAGATT 
AT G G C 

GTTCGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCA 
C C C C G C T 

TCATTTCTCCTTAGACATTGATGTAGGATGTACAGACTTA 
C GCG T C G 

AA7GAGGACC7AGG7G7A7GGG7GA7C777AAGAT7AAGA 
C A C C C C 

CGCAAGATGGGCACGCAAGAC7AGGGAA7C7AGAG777C7 
C C A 7 C C 7 

CGAAGAGAAACCA77AG7AGGAGAAGCGC7AGC7CG7G7G 
G C 7 7 C 

AAAAGAGCGGAGAAAAAA7GGAGAGACAAACG7GAAAAA7 
G A G G G G C 

7GGAATGGG AAAC AAA7 A7C G77T A7AAAG A GGC AAAAGA 
C T C CGC 

A7CTG7AGA7GC777A77TG7AAAC7C7CAA7ATGA7CAA 
GCG GCG C G 

TTACAAGCGGATACGAA7A7TGCCATGAT7CA7GCGGCAG 
G CCCCC CCC 

A7AAACG7G77CA7AGCA77CGAGAAGC77A7C7GCC7GA 
C G C 7 G C7 
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2341 

GCTGTCTGTGATTCCGGGTGTCAATGCGGCTATTTTTGAA 

T C CT GCTCCCG 

2880 

2881 

• • 

GAATTAGAAGGGCGTATTTTCACTGCATTCTCCCTATATG 
CTGA CTC TGC 

2920 

2921 

• • . . 
ATGCGAGAAATGTCATTAAAAATGGTGATTTTAATAATGG 

C C CGC CCC 

2960 

2961 

CTTATCCTGCTGGAACGTGAAAGGGCATGTAGATGTAGAA 

C CAG T T G C G G 

3000 

3001 

GAACAAAACAACCAACGTTCGGTCCTTGTTGTTCCGGAAT 

G TG C G G T G 

3040 

3041 

GGGAAGCAGAAGTGTCACAAGAAGT7CGTGTCTGTCCGGG 

T C G A A A 

3080 

3081 

TCGTGGCTATATCCTTCGTGTCACAGCGTACAAGGAGGGA 
AA CTC GCT 

3120 

3121 

TATGGAGAAGGTTGCGTAACCATTCATGAGATCGAGAACA 

C T G G C C 

3150 

3161 

ATACAGACGAACTGAAGTTTAGCAACTGCGTAGAAGAGGA 

C C G T CTC C G A 

3 2 00 

3201 

AATCTATCCAAATAACACGGTAACGTGTAATGA7TATACT 
CC CTTCCCC 

3240 

3241 

GTAAATCAAGAAGAATACGGAGGTGCGTACACTTCTCGTA 

G G G C AGC 

3 2 5 0 

3281 

ATCGAGGATATAACGAAGCTCCTTCCGTACCAGCTGATTA 
CA T C T T C 

3 3 2 0 

3321 

TGCGTCAGTCTATGAAGAAAAATCGTATACAGATGGACGA 
CCGCGG CC CA 

3 5 0 

3361 

AGAGAGAATCCTTGTGAATTTAACAGAGGG7ATAGGGATT 
CT C CGC TC C 

3 4 00 

3401 

ACACGCCACTACCAGTTGGTTATGTGACAAAAGAAT7AGA 

A T C 7CGGCT 

34 40 

3441 

ATACTTCCCAGAAACCGATAAGGTATGGATTGAGAT7GGA 

G TTG CAG C CT 

3 4 30 

3491 

G AAACGG AAGG AAC AT77ATCGTGGAC AGC G7GG AA77AC 

C G C C GC T 

i; ” 

3521 

TCCT7ATGGAGGAA 3534 

7 G 
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1 

ATGACTGCAGATAATAATACGGAAGCACTAGATAGCTCTA 
CCCC CCCT 

40 

41 

• • * * 
CAACAAAAGATGTCATTCAAAAAGGCATTTCCGTAGTAGG 
CTG TCGGTC TG 

80 

81 

• • • . 
TGATCTCCTAGGCGTAGTAGGTTTCCCGTTTGGTGGAGCG 
AC T G GTATCC C 

120 

121 

• • * 
CTTGTTTCGTTTTATACAAACTTTTTAAATACTATTTGGC 

C GAGC C CCCC 

160 

161 

• * • 
CAAGTGAAGACCCGTGGAAGGCTTTTATGGAACAAGTAGA 
CG T AAC G T 

200 

201 

AGCATTGATGGATCAGAAAATAGCTGATTATGCAAAAAAT 
TCT GTA CGC 

240 

241 

AAAGCTCTTGCAGAGTTACAGGGCCTTCAAAATAATGTCG 
GTG ACC GC G 

290 

281 

AAGATTATGTGAGTGCATTGAGTTCATGGCAAAAAAATCC 

G C C TCCAGC G G C 

320 

321 

TGTGAGTTCACGAAATCCACATAGCCAGGGGCGGA7AAGA 

T C CA T C A TA C 

360 

361 

gagctgttttctcaagcagaaagtcattttcgtaattcaa 

T C C TCC C CA A C 

400 

401 

TGCCTTCGTTTGCAATTTCTGGATACGAGGT7CTATTTCT 
AGC T C C T T C 

4 4 0 

441 

AACAACATATGCACAAGCTGCCAACACACAT77A77777A 
CTC T CCGCC 

4 SO 

481 

CTAAAAGACGC7CAAAT77A7GGAGAAGAA7GGGGATACG 

T G C G 

5 2 C 

521 

AAAAAGAAGATATTGCTGAATTTTATAAAAGACAACTAAA 

G GC GCCGCT T 

560 

561 

ACTTACGCAAGAATATACTGACCATTGTG7CAAATGGTAT 

G C C G C C G 

600 

601 

AA7GTTGGA7TAGA7AAA7TAAGAGG77CATC77A7GAAT 

C ICC GC C C7CCG 

6 4 0 

64 1 

C7TGGG7AAAC7TTAACCGT7A7CGCAGAGAGA7GACA77 

G C A A CA G C 
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• * • • 

681 AACAGTATTAGATTTAATTGCACT ATTTCC ATTGTATGAT 
GTGCCCTC C C C 

• • • • 

721 GTTCGGCTATACCCAAAAGAAGTTAAAACCGAATTAACAA 
GAAC G G TGCTC 

• • • . 

761 GAGACGTTTTAACAGATCCAATTGTCGGAGTCAACAACCT 
GC C T C T 

• • • • 

801 T AGGGGCTATGGAACAACCTTCTCTAATAT AGAAAATTAT 
T T AGC C C C 

• * • * 

841 ATTCGAAAACCACATCTATTTGACTATCTGC ATAGAATTC 
AG C C T C 

* ■ • * 

881 AATTTCACACGCGGTTCCAACCAGGATATTATGG AAATG A 
C AA T C T C 

* • t I 

921 CTCTTTCAATTATTGGTCCGGTAATTATGTTTCAACTAGA 
C C C C C 

961 CCAAGC ATAGGATCAAATGATATAATCACATCTCCATTCT 
T T C C C 

1001 ATGGAAATAAATCCAGTGAACCTGTACAAAATTTAGAATT 
TCG GGCCTG 

1041 TAATGGAGAAAAAGTCTATAGAGCCGTAG r AAATAC.VAT 
C C C G C C C 

1081 CTTGCGGTCTGGCCGTCCGCTGTATATTCAGGTGTTACAA 
CTG A ATC CC' 

1121 AAGTGGAATTTAGCCAATATAATGATCAAACAGATGAAGC 
G G TG C GC G 

1161 AAGTACACAAACGTACGACTCAAAAAGA* VTGTTGGCGCG 
CCCGT CCTC A 

1201 GTCAGCTGGGATTCTATCGATCAATTGCCTCCAGAAACAA 
TCT C C 

1241 C AGATGAACCTCT AG AAAAGGGAT ATAGCC ATC AACTC AA 
C AT G G C C C T 

12 81 TTATGTAATGTGCTTTTTAATGCAGGGTAGTAGAGG AACA 

C G C G A TCC G C 

13 21 ATCCCAGTGTTAACTTGGACACATAAAAGTGTAGACTTTT 

T G C C GTCC G C 

13 61 TTAACATGATTGATTCGAAAAAAATTACACAACTTCCGTT 
C C AGC G G C T C 


FIGURE 12B 
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• * • • 

1401 AGTAAAGGCATATAAGTTACAATCTGGTGCTTCCGTTGTC 14 40 

G G A C C C G 

1441 GCAGGTCCTAGGTTTAC AGGAGGAGATATC ATTC AATGC A 1480 

CACT TC CG 

1481 CAGAAAATGGAAGTGCGGCAACTATTTACGTTACACCGGA 1520 

GCCCAT C G T 

1521 TGTGTCGTACTCTCAAAAATATCGAGCTAGAATTCATTAT 15 60 

T G G CA G AC T C 

1561 GCTTCTACATCTCAGATAACATTTACACTCAGTTT AGACG 1600 

A CAGC C C C C G T 

• • • • 

1601 GGGCACCATTTAATCAATACTATTTCGATAAAACGATAAA 1540 

A CCCGTCTCGCC 

• • • t 

1641 T AAAGG AG AC AC ATTAAC GT AT AATTC ATTT AA7TT AGC A 1630 

C T TC C A C AGC C C G 

1681 AGTTTCAGCACACCATTCGAATTATCAGGGAATAACTTAC 17 20 

T C C C C TC T 

1721 AAATAGGCGTCACAGGATTAAGTGCTGGAGATAAAGTTTA 17 60 

GC CTCCCC C C 

17 61 T ATAG ACAAAATTG AATTTATTCCAGTG AAT 1 ~ 9 1 
C C G G C C C 


FIGURE 12C 
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1 ATG AATAATGTATTGAATAGTGGAAGAACAACTATTT 40 

GAC C C C CTC T C C 

• * * * 

41 GTGATGCGTATAATGTAGTAGCCCATGATCCATTTAGTTT 80 

CCACCCGTC CC' 

• • • * 

81 TGAACATAAATCATTAGAT ACCATCCAAAAAGAATGG ATG 120 

C C GAGCC C C T T G G G 

• • • • 

121 G AGTGGAAAAGAACAGATC ATAGTTTATATGTAGCTCCTG 160 

A C T T C CTC C C C C A 

161 TAGTCGGAACTGTGTCTAGTTTTTTGCTAAAGAAAGTGGG 200 

GT A CCCCTC GC 

201 GAGTCTTATTGGAAAAAGGATATTGAGTGAATTATGGGGG 24 0 

CTC C C CTC TCC C C T 

241 ATAATATTTCCTAGTGGTAGTACAAATCTAATGCAAGATA 280 

C C ATC GTCC T C C 

281 TTTTAAGGGAGACAGAACAATTCCTAAATCAAAGACTTAA 320 

CG C GTCCGCTC 

321 TACAGATACCCTTGCTCGTGTAAATGCAGAATTGATAGGG 3 60 

CT TG AACCTG CT 

3 61 CTCCAAGCGAATATAAGGGAGTTTAATCAACAAGTAGATA 4 00 

ACTCT CCG GC 

401 ATTTTTTAAACCCTACTCAAAACCCTGTTCCTTTATCAAT 4 40 

CCGTA GT G CTC 

441 AACTTCTTCGGTTAATACAATGCAGCAATTATTTCTAAAT 4 3C 

C CGCT CCCCC 

481 AGATTACCCCAGTTCCAGATACAAGGATACCAGTTGTTAT 520 

G T T T C • C CC 

521 T ATT ACCTTT ATTTGC AC AGGC AGC C AAT AT GCATCTTTC 560 

TC T AC C T T C CT G 

5 61 TTTTATTAGAGATGTTATTCTTAATGCAGATGAATGGGGT 600 

C C ACTCGCCCTC A 

601 ATTTCAGCAGCAACATTACGTACGTATCGAGATTACCTGA 64 0 

C T C TC TA G A CA C T 

64 1 GAAATTATACAAGAGATTATTCTAATTATTGTATAAATAC 680 

G C C TC T CCC CCC 
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• • • * 

681 GTATCAAACTGCGTTTAGAGGGTTAAACACCCGTTTACAC 720 

T G C C T AC C T TA GC T 

* • * • 

721 G ATATGTTAG AATTTAG AACATATATGTTTTTAAATGTAT 7 60 

C CTGCGCC CCTCG 
* • • • 

7 61 TTGAATATGTATCCATTTGGTCATTGTTTAAATATCAGAG 800 

G C CAG AGTC C C G C 

* • • • 

801 TCTTATGGTATCTTCTGGCGCTAATTTATATGCTAGCGGT 84 0 

CTG GC AC CCC CTCT C 

• * * * 

841 ACTGGACCACAGCAGACACAATCATTTACAGCACAAAACT * 380 

A T GAGC C T G 

881 GGCCATTTTTATATTCTCTTTTCCAAGTTAATTCGAATTA 920 

C G AGCT G C C- C C 

921 TATATTATCTGGTATTAGTGGTACTAGGCTTTCTATTACC 960 

C TC CAG CTC G C A C C A 

961 TTCCCTAATATTGGTGGTTTACCGGGTAGTACTACAACTC 1000 

T C C AC T A CTCC C 

1001 ATTCATTGAATAGTGCCAGGGTTAATTATAGCGGAGGAGT 1 ? 4 0 

AGCC T CTC A G C C T T 

1041 TTCATCTGGTCTCATAGGGGCGACTAATCTCAATCACAAC 1C 3 0 

CAGC A.T G T T A CT G C 

1081 TTTAATTGCAGC ACGGTCCTCCCTCCTT7ATCAACACCAT 

C TC C T G A C GAGC G 

1121 TTGTTAG AAGTTGGCTGG ATTCAGGTACAGATCG AGAGGG 11 -i 0 

G GTCC T CAGC T C A 

1161 CGTTGCT ACCTCTACGAATTGGCAGACAGAATCCTTTCAA 10 1 0 

A A C A C G C 

1201 ACAACTTTAAGTTTAAGGTGTGGTGCTTTTTCAGCCCGTG 1C 40 

CCTCCTC A C T A 

1241 GAAATTCAAACTATTTCCCAGATTATTTTATCCGTAATAT 12 30 

G CT CCC TA G C 

12 81 TTCTGGGGTTCCTTTAGTTATTAGAAACGAAGATCTAACA 13 0 0 

C T CCCCGT CCC 

13 21 AGACCGTTACACT ATAACCAAATAAGAAATATAG AAAGTC 1 _• v 0 

CTACTTC GTGCC GTC 

13 61 CTTCGGGAACACCTGGTGGAGCACGGGCCTATTTGGTATC 14 00 

ACTTAAT AATCCCG 
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1401 

* • • • 

TGTGCATAACAGAAAAAATAATATCTATGCCGCTAATGAA 
C. GGCC CTCCG 

1440 

1441 

• • • • 

AATGGTACTATGATCCATTTGGCGCCAGAAGATTATACAG 
CC TCCTA CT 

1480 

1481 

» 1 « t 

GATTTACTATATCGCCAATACATGCCACTCAAGTGAATAA 
CCCT C T C ■ C 

1520 

1521 

T CAAACTCGAACATTTATTTCTG AAAAATTTGG AAAT C AA 
GACCCCC GC 

1560 

1561 

* • * • 

GGTGATTCCTTAAGATTTGAACAAAGCAACACGACAGCTC 

C GGCGTC T C A 

1600 

1601 

G TTAT AC GCTT AG AGGG AATGGAAAT AGTT AC AATCTTT A 
GCTTG C CC C 

1640 

1641 

TTTAAGAGTATCTTCAATAGGAAATTCAACTATTCGAGTT 

C G TAGC CTTCCCCT 

1680 

1681 

ACTATAAACGGTAGAGTTTATACTGTTTCAAATGTTAATA 
CC ACT CACT GC 

1720 

1721 

C C ACT AC AAAT AACG ATGG AGTT AATG AT AATGG AGC TCG 
TAGCT C .CCC CA 

17 60 

1761 

TTTTTCAGATATTAATATCGGTAATATAGTAGCAAGTGAT 

A CAGC CCCTCCCGClC C 

1900 

1901 

AATACTAATGTAACGCTAGATATAAATGTGACATTAAACT 

C CTTTGCC CCCT 

1 3 u 

1841 

CCGGTACTCCATTTGATCTCATGAATATTATCTTTGTGCC 

T A C C 

I960 

1881 

AACTAATCTTCCACCACTTTAT 190 2 

C C T T G C 
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• • * • 

1 ATGGAGGAAAATAATCAAAATCAATGCATACCTTACAATT 40 

G C C C T A C 

41 GTTTAAGTAATCCTGAAGAAGTACTTTTGGATGGAGAACG 80 

C G C A G T GC T 

* • • i 

81 GATATCAACTGGTAATTCATCAATTGATATTTCTCTGTCA 120 

CT C CTCCCCCT C 

• * * « 

121 CTTGTTCAGTTTCTGGTATCTAACTTTGTACCAGGGGGAG 160 

T G C CAGC C G T T 

• • . . 

161 GATTTTTAGTTGGATTAATAGATTTTGTATGGGGAATAGT 200 

GCCTC C TCCC TC 
• . . . 

201 TGGCCCTTCTCAATGGGATGCATTTCTAGTACAAATTGAA 240 

T A C G G G 

241 CAATTAATTAATGAAAGAATAGCTGAATTTGCTAGGAATG 280

GGCCGGC GCC C 

281 CTGCTATTGCTAATTTAGAAGGATTAGGAAACAATTTCAA 320 

CC CG GCTC 

321 TATATATGTGGAAGCATTTAAAGAATGGGAAGAAGATCCT 360

CC GCC G GC 

361 AATAATCCAGAAACCAGGACCAGAGTAATTGATCGCTTTC 400

C G CCTGGCCAA CA 

401 GTATACTTGATGGGCTACTTGAAAGGGACATTCCTTCGT? 440

ACTGCCCTGGATCAC

441 TCGAATTTCTGGATTTGAAGTACCCCTTTTATCCGTTTAT 480

CA C CC T T C G GC

481 GCTCAAGCGGCCAATCTGCATCTAGCTATATTAAGAGATT 520

A T T C C CC TC CA

521 CTGTAATTTTTGGAGAAAGATGGGGATTGACAACGATAAA 560

GCC G G C T C

561 TGTCAATGAAAACTATAATAGACTAATTAGGCATATTGAT 600

C GTCC TC C C

601 GAATATGCTGATCACTGTGCAAATACGTATAATCGGGGAT 640

GCCC TCCCCTC 

641 TAAATAATTTACCGAAATCTACGTATCAAGATTGGATAAC 680

GCCCTG T T

681 ATATAATCGATTACGGAGAGACTTAACATTGACTGTATTA 720

C C CA G GA G CC C A T G 
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721 

GATATCGCCGCTTTCTTTCCAAACTATGACAATAGGAGAT

C T A C G C 

760 

761 

• • • •

ATCCAATTCAGCCAGTTGGTCAACTAACAAGGGAAGTTTA 
CTCA G T C A C 

800 

801 

• • • • 

TACGGACCCATTAATTAATTTTAATCCACAGTTACAGTCT 

T CT CCCT G AAG 

840 

841 

GTAGCTCAATTACCTACTTTTAACGTTATGGAGAGCAGCC 
CCCTCAC C TC 

880 

881 

• • * • 
GAATTAGAAATCCTCATTTATTTGATATATTGAATAATCT
TCGC ACG CC CC 

920 

921 

TACAATCTTTACGGATTGGTTTAGTGTTGGACGCAATTTT 

T CC CC GTCC 

960 

961 

TATTGGGGAGGACATCGAGTAATATCTAGCCTTATAGGAG 

T CA G C C CTCT * T 

1000 

1001 

GTGGTAACATAACATCTCCTATATATGGAAGAGAGGCGAA

G T C C C T A 

1040 

1041 

CCAGGAGCCTCCAAGATCCTTTACTTTTAATGGACCGGTA 

A C TAGT C C C C T A C 

1080

1081 

TTTAGGACTTTATCAAATCCTACTTTACGATTATTACAGC

C A C G T C C GA GC C . 

1120 

1121 

AACCTTGGCCAGCGCCACCATTTAATTTACGTGGTGTTGA

T T C CC TA A

1160

1161 

AGGAGTAGAATTTTCTACACCTACAAATAGCTTTACGTAT

G C T G C T C CTC C T C

1200

1201 

CGAGGAAGAGGTACGGTTGATTCTTTAACTGAATTACCGC

A T AC C G C CCA

1240

1241 

CTGAGGATAATAGTGTGCCACCTCGCGAAGGATATAGTCA
ACC CA G C CTCC

1280

1281 

TCGTTTATGTCATGCAACTTTTGTTCAAAGATCTGGAACA
CAGGCC CCGGCTC T

1320 

1321 

CCTTTTTTAACAACTGGTGTAGTATTTTCTTGGACCGATC

A CC CTAATGCA T

1360

1361 

GTAGTGCAACTCTTACAAATACAATTGATCCAGAGAGAAT

T C T C C G

1400


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1401 

* * # • 

TAATCAAATACCTTTAGTGAAAGGATTTAGAGTTTGGGGG 

C CAGCGTCCTG A 

1440 

1441 

• • . • 

GGCACCTCTGTCATTACAGGACCAGGATTTACAGGAGGGG 
AT C C C T 

1480 

1481 

• • . • 

ATATCCTTCGAAGAAATACCTTTGGTGATTTTGTATCTCT 

T A C T C C GAGC 

1520 

1521 

ACAAGTCAATATTAATTCACCAATTACCCAAAGATACCGT 

C TCCCT T T 

1560 

1561 

TTAAGATTTCGTTACGCTTCCAGTAGGGATGCACGAGTTA 

C C G A ' TTCCC T C TA C 

1600 

1601 

* * • * 

TAGTATTAACAGGAGCGGCATCCACAGGAGTGGGAGGCCA 

CGCCCCATTCTCTA 

1640 

1641 

* * * * 

AGTTAGTGTAAATATGCCTCTTCAGAAAACTATGGAAATA 
CTCC G C AC G G C 

1680 

1681 

GGGGAGAACTTAACATCTAGAACATTTAGATATACCGATT 

C G CGCC C C 

1720 

1721 

TTAGTAATCCTTTTTCATTTAGAGCTAATCCAGATATAAT 
CTC C CAGT CC T C C T C C 

1760

1761 

TGGGATAAGTGAACAACCTCTATTTGGTGCAGGTTCTATT
CTC C AT AGC C 

1800

1801 

AGTAGCGGT^AACTTTATATAGATAAAATTGAAATTATTC

TCATCT C TGCTCG GC 

1840

1841 

TAGCAGATGCAACATTTGAAGCAGAATCTGATTTAGAAAG
TCCTCCCGTG ACA CC T G 

1880

1881 

ACCACAAAAGGCGGTGAATCCCCTGTTTACTTCTTCCAAT

C G T C C C CA 

• 

1920

1921 

CAAATCGGGTTAAAAACCGATGTGACGGATTATCATATTG 
GCTCG TACTTC C 

1960

1961 

ATCAAGTATCCAATTTAGTGGATTGTTTATCAGATGAATT 

C G C G CACC ACC TAGC G 

2000 

2001 

TTGTCTGGATGAAAAGCGAGAATTGTCCGAGAAAGTCAAA
CCCCG TCC T

2040

2041 

CATGCGAAGCGACTCAGTGATGAGCGGAATTTACTTCAAG
CC T CCA CCTG

2080

2081 

ATCCAAACTTCAGAGGGATCAATAGACAACCAGACCGTGG
CT C A AC C G G A

2120
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2121 CTGGAG AGGAAGTACAGA7A77ACCA7CCAAGG AGG AGA7 
TGT CCGGC CC 

* • * ■ 

2161 GACGTATTCAAAGAGAATTACGTCACACTACCGGGTACCG
TG G C CCTCATT 

• • • • 

2201 TTGATGAGTGCTATCCAACGTATTTATATCAGAAAATAGA
CC CTCCGC GC 

• • • • 

2241 TGAGTCGAAATTAAAAGCTTATACCCGTTATGAATTAAGA 
C CC CTC AG CCT 
• • • • 

2281 GGGTATATCGAAGATAGTCAAGACTTAGAAATCTATTTGA 
CC CC CT CC 


2321 TCCGTTACAATGCAAAACACGAAATAGTAAATGTGCCAGG 
AG CG G CC G C 

* • • • 
2361 CACGGGTTCCTTATGGCCGCTTTCAGCCCAAATGCCAATC 
T T C C A T TCT C T 

2401 GGAAAGTGTGGAGAACCGAATCGATGCGCGCCACACCTTG
G G T CA T 

2441 AATGGAATCCTGATCTAGATTGTTCCTGCAGAGACGGGGA 
G CTGCC GTC 

2481 AAAATGTGCACATCATTCCCATCATTTCACCTTGGATATT
GG CC T CT CC 

2521 GATGTTGGATGTACAGACTTAAATGAGGACTTAGGTGTAT
G T C G CCAC

2561 GGGTGATATTCAAGATTAAGACGCAAGATGGCCATGCAAG
C C C C C A C 

2601 ACTAGGGAATCTAGAGTTTCTCGAAGAGAAACCATTATTA
T C C T GG C

2641 GGGGAAGCACTAGCTCGTGTGAAAAGAGCGGAGAAGAAGT
T T C G A

2681 GGAGAGACAAACGAGAGAAACTGCAGTTGGAAACAAATAT
G T CG A G T C

2721 TGTTTATAAAGAGGCAAAAGAATCTGTAGATGCTTTATTT
C CG C GCG GC 
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2801 7CGCCATGA7TCA7GCGGCAGA7AAACGCG77CATAGAA7 
CCC C TGCC
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2041 

CCGGGAAGCGTATCTGCCAGAGTTGTCTGTGATTCCAGGT 
TTGTCT T C CT 

2880 

2881 
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GTCAATGCGGCCATTTTCGAAGAATTAGAGGGACGTATTT 
GCT C GCT C 

2920 

2921 
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TTACAGCGTATTCCTTATATGATGCGAGAAATGTCAT7AA 
CATC GC C C C 

2960 

2961 

• • • • 

AAATGGCGATTTCAATAATGGCTTATTATGCTGGAACGTG 

G C T C C C CAGC T 

3000 

3001

* • • * 

AAAGGTCATGTAGATGTAGAAGAGCAAAACAACCACCGTT 
GCGGAG TG 

3040 

3041 

• ■ t t 

CGGTCCTTGTTATCCCAGAATGGGAGGCAGAAGTGTCACA

C GGGTG AT C 

3080 


3081 AGAGGTTCGTGTCTGTCCAGGTCGTGGCTATATCCTTCGT 3120 


ft A A A C T C 

3121 GTCACAGCATATAAAGAGGGATATGGAGAGGGCTGCGTAA 3160 

GCTCG CT T G 

3161 CGATCCATGAGATCGAAGACAATACAGACGAACTGAAATT 3200 

C C GACC G T G

3201 CAGCAACTGTGTAGAAGAGGAAGTATATCCAAACAACACA 3240

TC CC.GAAC C C 

3241 GTAACGTGTAATAATTATACTGGGACTCAAGAAGAATATG 3280

TTCCGCC TAG GC 

3281 AGGGTACGTACACTTCTCGTAATCAAGGATATCACGAAGC 3320

GA G C AGC CAG T CA 

3321 CTATGGTAATAACCCTTCCGTACCAGCTGATTACGCTTCA 3360

TCC TCXXXXXXXXXXXX T T C T C C

3361 GTCTATGAAGAAAAATCGTATACAGATGGACGAAGAGAGA 3400

GCGG CC CA C T

3401 ATCCTTGTGAATCTAACAGAGGCTATGGGGATTACACACC 3440

C C G TC T CA C 

3441 ACTACCGGCTGGTTATGTAACAAAGGATTTAGAGTACTTC 3480

TAT C TC GC T T 

3481 CCAGAGACCGATAAGGTATGGATTGAGATCGGAGAAACAC 3520

T C A G C T C

3521 AAGGAACATTCATCGTGGATAGCGTGGAATTACTCCTTAT 3560

G C C GC T T G 


GGAGGAA 3567 
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1 AGATCTAGAGGTAATTGTTATGAGTACTGTCGTGGTTAAG 

GATC 

41 GGAAACGTCAACGGTGGTGTACAACAACCTAGAAGGAGGA 
G T A 

81 GAAGGCAATCCCTTCGCAGGAGGGCTAACAGAGTACAGCC

T A T 

121 AGTGGTTATGGTCACTGCTCCTGGCGAACCCAGGAGGAGG

GC A A A 

161 AGACGCAGAAGAGGAGGCAATCGCAGGTCAAGAAGAACTG 
AG T A 

201 GAGTTCCCAGGGGAAGGGGCTCAAGCGAGACATTCGTGTT
A AT 

241 TACAAAGGACAACCTCGTGGGCAACTCCCAAGGAAGTTTC

281 ACCTTCGGACCAAGTGTATCAGACTGTCCAGCATTCAAGG

T 

321 ATGGAATACTCAAGGCCTACCATGAGTACAAGATCACAAG

T 

361 TATCCTTCTTCAGTTCGTCAGCGAGGCCTCTTCCACCTCA

T G T 

401 CCAGGATCCATCGCTTATGAGTTGGACCCACATTGCAAAG

C AT 

441 TATCATCCCTCCAGTCCTACGTCAACAAGTTCCAAATCAC
T 

481 AAAGGGAGGAGCTAAGACCTATCAAGCTAGGATGATCAAC

T T C T 

521 GGAGTAGAATGGCACGATTCATCTGAGGATCAGTGCAGGA
T T A 

561 TACTTTGGAAAGGAAGTGGAAAATCTTCAGACCCAGCAGG

C A G T T 

601 ATCTTTCAGAGTCACCATCAGAGTGGCTCTTCAAAACCCC

T T A 

641 AAGTAATAGACTCCGGATCAGAGCCT0GTCCAAGCCCACA
A T 

FIGURE 16A. 
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681 ACCAACACCCACTCCAACTCCCCAAAAGCATGAGCGATTT 720 

• • • • 

721 ATTGCTTACGTCGGCATACCTATGCTGACCATTCAAGAAT 760

761 TC 762 
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